INTRODUCTION BY THE GUEST EDITOR


FOR the second time this Journal is devoting an entire issue
to problems of radio research. The first radio issue
appeared in February 1939 and contained an outline of
the entire field as it could be visualized at that time. Since
then at least three major trends in radio research have become 
more noticeable. Studies on the effect of radio are moving 
into the foreground; material collected for commercial purposes
is ever more frequently available for scientific analysis;
and related areas such as reading research are developing
so fast that the discipline of general communications research 
seems in the making. This introduction attempts to relate 
the papers of this issue to these three trends. 

The most outstanding new development is the beginning of 
a systematic study of the effects of radio in various fields. 
Social scientists have long felt that the stimulus-response ap- 
proach should be central in all research concerning human be- 
ings, but that this problem is usually tackled under laboratory 
conditions so “unnatural” that the results are never satisfying.
Radio has now made the whole nation an experimental situa- 
tion. A rather centrally controlled industry provides a variety 
of stimuli, the reactions to which can be studied and compared 
in all groups of the population. 

So far, because of the ownership structure of American 
radio, commercial effects are given most attention, but the 
methods developed here will prove to be useful in other fields. 
The four papers beginning this issue show four different ways 
to study commercial effects. The first (Stanton) uses the 
objective relationship between listening to a program and 
ownership of the product advertised. The second paper 
(Smith-Suchman) starts with the same correlation but adds 
interviews to determine why the respondent bought the product.
A comparison of these two methods in a special case
permits an appraisal of the validity of the retrospective inter- 
view. The third paper (Fleiss) deals only with retrospective 
reports of respondents, but shows how greatly their utility 
can be increased if the interviews are made in the semi-experimental
setting of a “panel.” The fourth paper (Erdelyi)
uses a refined objective comparison. The frequency with which 
songs are played over the radio is related to the frequency with 
which they are sold in the form of sheets; this approach reveals
the time lapse between the peak of radio’s influence and the 
peak of its effect. 

Whereas commercial effects occupy most attention now be- 
cause money to carry through the experiments is available, 
educational effects are next within reach for research, because 
here the schoolroom provides a ready-made laboratory. Three 
further papers in this issue deal with educational effects of 
broadcasting (Miles and Ried) or with effects which can be
measured in the schoolroom (Wiebe). The studies exemplified 
by the first two sections of this issue are only a beginning. 
They show that effects do occur and that they can be measured. 
The next task is to compare effects of different magnitude and 
to study the conditions which account for their variations. 

It should not be assumed that the laboratory techniques of 
experimental psychology must be neglected in radio research. 
The “program analyzer” reported in the program research
section (Peterman, Schwerin, Daniel) is an adaptation of the 
well-known polygraph, for finding those parts of a program 
which are liked and disliked. This problem is, so to speak, the 
other side of the effect-question. We not only want to know 
what effects certain broadcasts have, but we want also to know 
why they have them, in terms of attributes of the program. 
Program research has been rather neglected so far. The speed 
with which the industry works and the ease of accumulating 
rules of thumb out of daily experience have made systematic 
program-testing appear less urgent. As the radio industry 
settles down, however, there will be more interest in the psychology
of program construction, just as mass production industries,
in settling, became interested in efficiency studies of all
kinds.

Some time ago the psychologist’s main source of informa- 
tion was the freshman newly enrolled in universities. The 
plethora of field work now in progress everywhere offers much 
greater opportunity to solve problems which have vexed the 
technicians. Wherever the habits and attitudes of people are 
studied, two basic problems are paramount: how respondents 
should be sampled and how they should be asked for the infor- 
mation which is needed. Two papers (Suchman-McCandless 
and Gaudet-Wilson) deal with especially urgent sampling problems.
The first discusses the bias introduced by using mailed
questionnaires, the second the representativeness of incomplete 
returns in personal interview surveys. As far as we know, it 
is the first time that evidence on these two points is available. 

As to the getting of correct information, the task basic for 
all further work is to find out the communications to which people
are actually exposed. Here radio research lags behind
reading research which for many years has experimented with 
and used the recognition method for ascertaining people’s 
reading habits. Therefore we have included two contributions 
(Lucas and Franzen) which explain and test this recognition 
procedure. Although it is not immediately applicable to 
radio, it should be suggestive for the development of a corre- 
sponding standard method of studying listening behavior. 

The two problems of sampling and asking for listening habits 
are characteristically intertwined in studies where the popu- 
larity of radio stations is to be appraised. Therefore the sec- 
tion on research techniques includes a paper (Lazarsfeld) 
discussing the intricate methodological difficulties implied in 
such surveys. 

The two reports from the field of reading research have been 
included in this issue for still another reason. Radio research 
will not long remain isolated, but will merge into the larger 
stream of communications research. The methods of radio
and reading research are so similar, the social significance of
reading and listening so interlocking, that a trend toward 
fusion of funds and research institutions will occur. Even 
now the impetuous research activities of the radio industry 
have forced magazine and newspaper publishers to do more 
research than before. The applied psychologist is the natural 
coordinator of those efforts with competitive commercial pur- 
poses but allied scientific aims. 

In any case, psychologists will see material available for the 
asking that was not dreamt of ten years ago. How many psy- 
chologists know that by looking into the trade press or by 
writing to the research departments of the major radio net- 
works they can get more data than they and their students 
together could work up in years? The same is true of maga- 
zines which accumulate not only circulation data, but also 
information on how many and which people actually read 
and like their stories. Of course this type of mass data re- 
quires special methods of statistical handling. Thus a number 
of papers dealing with somewhat advanced methods of statistics
(Daniel, Franzen, Robinson) also appear in this issue.
Some readers may find these papers difficult, but others will 
welcome the practical application of methods such as factor 
analysis which have so far been used only for the very specific 
problems for which they were originally developed. 

Most of the papers appearing here come from Columbia 
University’s Office of Radio Research (formerly the Princeton
Radio Research Project) endowed by the Rockefeller
Foundation to study the meaning of radio in the lives of its 
listeners. Nearly every study undertaken by this office yields 
as a by-product clarification of methodological problems. It is 
a good sign that other agencies, such as the Ohio State Univer- 
sity’s Evaluation of School Broadcasts Project, the Psycho- 
logical Corporation, and individual students have contributed 
their share, showing that radio research, or still better, com- 
munications research, has become an integral part of applied 
psychology.


I. Commercial Effects of Radio


A TWO-WAY CHECK ON THE SALES 
INFLUENCE OF A SPECIFIC 
RADIO PROGRAM 


FRANK STANTON 
Director of Research, Columbia Broadcasting System 


IT is often a difficult and complex research operation for a
manufacturer with a diversified advertising schedule to
isolate and quantify the sales-effectiveness of any single
medium. An advertiser knows, for example, what he is spend- 
ing in a particular medium. But what, concretely, is he 
getting? Can the influence of his sales messages be measured 
beyond tentative equations and surmises? Up to now, there 
have been very few reports of successful attempts to “partial
out” the contributions of a specific advertising effort.

In the light (or rather in the uncertainty) of this background,
a test-study conducted last Spring¹ for a large food
manufacturer is directly important to the trade and to the 
marketing and advertising professions generally. This adver- 
tiser sponsored a daytime radio series over a limited CBS net- 
work and he wanted to determine statistically how effective 
that program was. It had been on the air only six months 
when he started the investigation. He wanted to know how 
many units of his product the program was selling which 
would not have been sold otherwise. 

The investigation was carefully worked out in cooperation 
with the manufacturer’s advertising agent and a field research 
agency.² For the experimental sample, two markets were
selected in which all of the advertiser’s sales factors were
exactly comparable except that his program was broadcast in
Market A and was not broadcast in Market B. Both markets
were equal from the standpoint of distribution of the
advertiser’s product, and they were almost precisely matched with
respect to population and to retail outlets in his field. Also
of importance in appraising the selection of the two test areas
is the fact that the advertiser’s sales were virtually equal in
both cities prior to the start of the radio campaign.


1 Rip off the Mask!, Columbia Broadcasting System, Inc., December,
1939. 6 pp.

2 For an up-to-date report on experiments in this field, as well as a
discussion of the various techniques and objectives, see Crossley, Archibald
M., “Radio and Sales,” Variety Radio Directory, 1940-1941. New
York: Variety, Inc., 1940. Pp. 37-93.








                               Market A        Market B
                               (Radio)         (Non-radio)

Population, 1930 ...........   149,900         127,412
Retail Sales, 1935* ........   $14,047,000     $11,175,000
Retail Outlets, 1935* ......   506             497

* In the manufacturer’s field.


Before we examine the objectives and techniques of the 
study, it would be well to point out some peculiarities of the 
product which complicate the survey and make the findings 
even more decisive. The product presents a particularly diffi- 
cult advertising problem because it has no significant visible 
variation in quality from brand to brand. Its sales correlate 
positively with population in every part of the country. It 
is a highly necessary staple product, used (and used in the 
same brands) in shanties and mansions. Under these condi- 
tions the product clearly furnishes a test of the effectiveness 
of advertising. 

CBS commissioned Crossley, Inc., to conduct the survey 
within the framework of three objectives, which would reveal 
the influence of the program in all its implications:


a. To compare total over-the-counter retail sales of the 
brand among dealers in the radio market, Market A, 
and in the non-radio market, Market B. 

b. In Market A, to find the statistical relationship, if
any existed, between listening and buying; that is, to
determine whether families who listened to the program
bought the brand more, less, or as much as families
who did not listen.

c. To find any possible correlation between buying and
regularity of listening; that is, to discover whether
families who listened regularly were more or less faithful
buyers than families who listened occasionally.

To accomplish these objectives, the field investigators visited
stores, made careful checks upon dealer inventories, peered
over the retailer’s shoulder at his purchase records and his 
bills of lading; they telephoned families, interviewed them 
personally and inventoried their pantries. Throughout the 
investigation the aim was to measure “buying behavior” in
terms of actual sales in stores, or by pantry inventories in the 
homes. All opinion questions were avoided. Each investi- 
gator was armed with carefully pre-tested questions to ask 
storekeepers and buyers, a set of checks and cross-checks on 
their larders and listening habits, which when coordinated 
would produce a clear, unmistakable answer to the question. 


GENERAL CONCLUSIONS 


After a month of taking store and home inventories, a month 
of family and dealer interviews, the Crossley organization had 
a three-part report to make to the manufacturer:


a. Total retail sales of his brand were 88 per cent greater 
among dealers selling his product at its standard price 
in Market A (the radio market) than among similar 
dealers in Market B (the non-radio market). 

b. Sale of the brand was 81 per cent higher than that of the 
next most popular brand among families which 
listened in Market A; but among non-listeners it was 
only 7 per cent higher. 

c. Sale of the brand was 236 per cent higher than that of
the nearest competing brand among regular, day-to-day
listeners in Market A; among occasional listeners,
it was 59 per cent higher.


These three points, substantiated in each case by an adequate, 
representative sampling of dealers and buyers in both markets, 
were quantitative conclusions about the sales-impact of the 
program. In brief, the investigation found that an average
radio program—with an average audience—sells a branded 
product and builds a measurable degree of buying consistency 
among its listeners. In this case, the selling-edge of a radio 
program was isolated and measured for the sponsor perhaps 
more closely than it had ever been before. It is illuminating 
to examine the results separately and in greater detail. 





Comparison of Dealer Sales in Radio vs. Non-Radio Market 


We have seen that stores in the radio market selling the 
brand at its standard price sold 88 per cent more packages 
than similar stores in the non-radio market. This figure was 
ascertained after a month of careful store checking. 

In each test-market, the sales of 20 per cent of all the retail 
outlets were checked for four weeks. The control factor used 
in selecting the stores to be tested was that those in Market A 
and those in Market B did almost the same total volume of 
business. Another basic criterion was that the number of 
dealers selected from each category (independent grocery, 
chain grocery, independent supermarket, chain supermarket, 
and delicatessen) conform in percentage to the number of out- 
lets of each type in the entire test-area. 

The check of sales was launched by taking the original in- 
ventory of each store at the start of the month: taking the 
inventory by an actual count of the number of packages of 
the brand each grocer had on his shelves and in his store- 
rooms. Investigators did the counting, not the grocer. This 
kind of inventory-taking was repeated at the end of each week 
for a month; and allowance was made for the amount of the 
product received by the retailer each week, through a thorough 
examination of his bills of lading. In that meticulous way— 
and not simply through careless or hurried reports from gro- 
cers—total sale in each market was determined. 
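The inventory arithmetic described above can be sketched as follows. This is only a minimal illustration, not the field agency's actual tabulation: the function name and every count below are hypothetical, and the sketch assumes that a week's sales equal opening stock plus packages received (from the bills of lading) minus closing stock.

```python
def weekly_sales(opening_count, received, closing_count):
    """Packages sold in one week, inferred from shelf counts and bills of lading."""
    sold = opening_count + received - closing_count
    if sold < 0:
        # More stock on hand than the records can explain: flag it for re-checking.
        raise ValueError("closing stock exceeds opening stock plus receipts")
    return sold


# Hypothetical four-week record for one store: (opening, received, closing).
weeks = [(40, 12, 30), (30, 24, 28), (28, 0, 19), (19, 24, 22)]
monthly_total = sum(weekly_sales(o, r, c) for o, r, c in weeks)
print(monthly_total)
```

Summing such store totals over all checked dealers in a market would give the market-level package counts reported in the text.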

These were the results. In the radio market, 2,455 packages 
were sold by 83 dealers, each of them selling the brand at its 
standard price. In the non-radio market, 89 similar dealers 
sold 1,407 packages, a ratio of 188 to 100 in favor of dealers
in the radio market. 
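As a rough check on this ratio, the package and dealer counts quoted above can be reduced to sales per dealer and expressed as an index with the non-radio market set at 100. The sketch below assumes a simple per-dealer average over the four-week period; the text does not state exactly how the published figure was averaged or rounded, and the simple quotient comes out at about 187, a point below the reported 188.

```python
def per_dealer(packages, dealers):
    """Average number of packages sold per dealer over the test period."""
    return packages / dealers


radio = per_dealer(2455, 83)       # Market A, radio
non_radio = per_dealer(1407, 89)   # Market B, non-radio

# Index of radio-market sales per dealer, with the non-radio market = 100.
index = 100 * radio / non_radio
print(f"{radio:.1f} vs {non_radio:.1f} packages per dealer, index {index:.1f}")
```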





[Chart: Comparison of dealer sales in the radio market vs. the
non-radio market. 188 units of sale per week in average stores in
the radio market, expressed in terms of every 100 units of the
product sold weekly in average stores in the non-radio market.]











Use of Brand among Listeners vs. Non-Listeners 


The correlation between buying and listening within the 
radio market is just as sharply defined as the sales comparison 
for the radio vs. non-radio market. In Market A, the radio
market, a “coincidental telephone survey” of 4131 families
(a random cross-section of all telephone families in the city)
was made while the program was actually on the air to isolate 
two groups: those who were listening and those who were not. 
This procedure enabled the investigators to return to these 
same families and conduct personal, face-to-face interviews 
with the families which were “known listeners” to the program
and with those which were not listening. Among listening
families, the interviewer found out whether they listened
regularly or occasionally. Among non-listening families, he 
found whether they listened to it occasionally or whether they 
had never heard the program. Thus, consumer-families were 
segregated into three groups: regular listeners, occasional 
listeners, and non-listeners. 

The re-interviewed families were first queried about their
listening habits and classified in the three groups already men- 
tioned. Then the interviewers asked permission to enter the 
pantry to observe for themselves what brand of the product 
was on the shelves. Thus, when the investigators departed, 
they knew whether each respondent listened to the program 
and they knew further what brand of the product each family 
actually had on hand. 

The first comparison which emerged from the call-back 
interview had to do with the buying preferences of the listeners 
(both regular and occasional) as compared with the non- 
listeners. The contrast was sharp. Among listener families, 
it was determined that for every 100 families which used the 
next competing brand, 181 families used the sponsor’s brand— 
a plus for listening families of 81 per cent. On the other hand 
—and this distinction is clearly represented in the chart below 
—among non-listening families the sponsor’s brand had a plus 
of only 7 per cent. 





[Chart: Use of the product among listeners and non-listeners, based
upon home inventory checks in the radio market. Panels: listening
families; non-listening families.]

As a refinement of this comparison, the listening families
were separated into regular and occasional groups to see
whether frequent “advertising impressions” had more solid
sales impact than infrequent impressions. Among regular
listeners, the reports showed that for every 100 families which
used the next competing brand, 336 used the sponsor’s brand,
a plus for regular listeners of 236 per cent. Among occasional
listeners, for every 100 families which used the next
competing brand, 159 used the sponsor’s brand, a plus for
occasional listeners of 59 per cent.
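The “plus” figures in this section all follow one convention: the number of families using the sponsor's brand per 100 families using the next competing brand, less 100. The short sketch below restates that arithmetic; the function name is illustrative, and the value 107 for non-listeners is not quoted directly but is implied by the 7 per cent plus reported above. Note that an index of 336 corresponds to a plus of 236 per cent.

```python
def plus_over_competitor(sponsor_per_100):
    """Percentage by which sponsor-brand families exceed competitor-brand families.

    sponsor_per_100: families using the sponsor's brand for every 100
    families using the next competing brand.
    """
    return sponsor_per_100 - 100


# Index values taken from the text; 107 is implied by the 7 per cent plus.
indices = {
    "all listeners": 181,
    "non-listeners": 107,
    "regular listeners": 336,
    "occasional listeners": 159,
}
pluses = {group: plus_over_competitor(value) for group, value in indices.items()}
for group, plus in pluses.items():
    print(f"{group}: plus {plus} per cent")
```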





[Chart: Use of the product among regular and occasional listeners,
based upon home inventory checks in the radio market. For regular
listeners and for occasional listeners, families using the sponsor’s
brand are shown against each 100 families using the next competing
brand.]


SUMMARY


The results of this two-way check—radio market vs. non- 
radio market plus listeners vs. non-listeners—were consistent 
and significant. They showed that a network radio program 
had a definite measurable effect upon the movement of a 
product over dealers’ counters; 188 packages were sold for 
every 100 that might have been sold without the use of radio. 
In addition, it was demonstrated that listeners to this program 
constituted a more extensive market for the product than did 
families who had never heard this program. Not only did 
more listeners stock the product, but it was found to have a 
greater superiority over the next competing brand among 
listeners than among non-listeners. Furthermore, frequency
of listening was found to exert a potent effect upon the buying 
habits of families. More regular listeners stocked the product 
than did occasional listeners. Therefore, the more a family 
listened, the more it was likely to buy.³


3 Too late to be included in this issue of the JOURNAL OF APPLIED
PSYCHOLOGY is a more recent study following, except for one additional
step, the same basic pattern of research described in the second part of
this report. In the later experiment the author measured the distribution
of a family of products advertised by a radio program series among regular
listeners and non-listeners (to the specific program) who were
“matched” on the basis of socio-economic status, family size, and general
exposure to other national advertising media. This refinement in technique
was introduced to eliminate the possible bias or influence of the
listeners’ purchasing power or exposure to other advertising media on
comparisons of the distribution of the advertised products among “regular”
and “non-listener” groups.

The results of the later study show that for the major product advertised
on the radio program there were 128 “regular-listener” homes with
the product on the shelves for every 100 matched “non-listener” families
with the product. This same trend (greater distribution among listener
families) was also observed for three secondary products advertised by
the program. Below are the results based on household inventories among
matched groups of “regular” and “non-listener” families.


              Distribution among matched groups of:

Product      “Regular Listeners”      “Non-Listeners”
A                  44.3%                   34.7%
B                  32.7                    26.3
C                   4.3                     1.3
D                   1.5                     0.4
N =                539                     539
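The figure of 128 regular-listener homes per 100 matched non-listener homes can be recovered from the percentages for Product A. The sketch below applies the same index to all four products; it is an illustration of the arithmetic only, and the indices for Products C and D rest on very small percentages, so they should be read with corresponding caution.

```python
# Per cent of homes stocking each product: (regular listeners, non-listeners).
# Both matched groups contain 539 families.
distribution = {
    "A": (44.3, 34.7),
    "B": (32.7, 26.3),
    "C": (4.3, 1.3),
    "D": (1.5, 0.4),
}

# Regular-listener homes with the product per 100 matched non-listener homes.
index_per_100 = {
    product: round(100 * regular / non_listener)
    for product, (regular, non_listener) in distribution.items()
}
for product, index in index_per_100.items():
    print(f"Product {product}: {index} per 100")
```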


DO PEOPLE KNOW WHY THEY BUY?


ELIAS SMITH AND EDWARD SUCHMAN
Office of Radio Research, Columbia University 


THE standard way of showing the influence of radio upon
people’s buying habits consists in presenting a four-fold
table which divides the sample into listeners and
non-listeners on the one hand, and buyers and non-buyers on
the other hand.¹ For instance, in a telephone survey made in
March 1938 in Syracuse, New York, people were called up dur- 
ing the time Boake Carter (who was then advertising Philco 
radios) was on the air, and asked for the program to which 
they were listening at the moment of the call, and also the 
make of radio they owned.² The following four-fold table
resulted.


TABLE 1

Relation between Owning a Certain Brand of Radio Set and Listening
to a Program which Advertises It

Listened to            Owned a Philco Radio
Boake Carter           Yes        No        Total
Yes                    104        130        234
No                     208        322        530
Total                  312        452        764


The tetrachoric correlation of this table is .1. This
shows a slightly positive relationship, indicating that people
who listen to Boake Carter are more likely to own a Philco
radio.
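The relationship in Table 1 can be examined numerically. The sketch below compares ownership rates in the two rows and then applies the cosine-pi approximation to the tetrachoric correlation. That approximation is an assumption chosen here for brevity; the text does not say how the authors computed their coefficient, so this is not presented as their method. The variable names are illustrative.

```python
import math

# Cell counts from Table 1.
a, b = 104, 130   # listened to Boake Carter: own Philco, do not own
c, d = 208, 322   # did not listen:           own Philco, do not own

own_rate_listeners = a / (a + b)
own_rate_non_listeners = c / (c + d)

# Cosine-pi approximation to the tetrachoric correlation:
#   r = cos(pi / (1 + sqrt(ad / bc)))
r_approx = math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))

print(f"Philco ownership among listeners:     {own_rate_listeners:.1%}")
print(f"Philco ownership among non-listeners: {own_rate_non_listeners:.1%}")
print(f"Approximate tetrachoric r: {r_approx:.2f}")
```

Under this approximation the coefficient is about .08, which agrees with the reported value of .1 after rounding to one decimal place.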

There are a number of practical and theoretical objections
to be raised to such a table. Practically, the collection of the
information necessary to compile such a table is quite expensive,
especially if the survey is not restricted to telephone
homes. Theoretically, one is free to challenge the causal
interpretation of such a table: it might be either that people listen
to Boake Carter because they own a Philco (and not the other
way around), or that there is a spurious factor involved, for
example, that wealthier people are more likely to own a Philco
and also to listen to Boake Carter.³


1 See Stanton, Frank, “A Two-Way Check on the Sales Influence of a
Specific Radio Program,” Journal of Applied Psychology, this issue, p.
665 ff.

2 These telephone interviews were made by Market Research Corporation
of America.

Obviously, it would be a great improvement upon this type 
of a survey to ask those people who both own a Philco and 
listen to Boake Carter directly whether the commentator influ- 
enced them in their purchase of a radio set, provided that the 
information so obtained is valid. The problem stated in this 
generality is of momentous importance because it implies the 
whole question of the validity to be attached to people’s ex- 
planations of their actions. It will take years of coordin- 
ated research to give a final answer, and even then it would 
have to be conditional upon many qualifications such as, for 
example, the nature of the product being investigated, the 
time of the interview, etc. The experiment to be reported here
purports to formulate the problem of this direct interview 
more clearly and to show that its investigation is not so hope- 
less as may often be thought. 

The purpose of the investigation, then, is to find out, from 
a number of people who own Philco radios and listen to Boake 
Carter, the proportion who can be considered to have bought 
their radio because they listen to the program. We shall 
attempt to show how, through the use of these personal inter- 
views, it is possible to approach the same result concerning the 
effectiveness of the Boake Carter advertising arrived at from 
the extensive telephone survey given in Table 1.⁴


3 A matching procedure would meet this objection.

4 The subjects for this interview study were chosen at random from the
upper left-hand box of Table 1 and a second similar one; this box contains
Philco owner-Boake Carter listeners found in the telephone survey.
However, in the interviewing it was necessary to eliminate those indi- 
viduals who had bought their Philcos before ever listening to Boake Carter. 
It was also necessary to eliminate those respondents who upon being inter- 
viewed turned out to be only occasional listeners to Boake Carter. 

The task divides itself into three steps: 1) getting all possible
relevant information on each individual purchase; 2) formulating
clearly what is meant by the phrase, “A person buys because
. . .”; 3) looking for a criterion of validity for the final
result.


HOW CAN WE OBTAIN THE NEEDED INFORMATION?


In order to get a maximum of useful information, a specific 
kind of questionnaire must be developed. This sort of ques- 
tionnaire has been described elsewhere in its more detailed 
aspects.⁵ Here only a brief topical summary of the elaborate
form developed in this study is given. The main idea is that 
there are different types of purchasers and the task of the 
interview consists in first ascertaining to which type the spe- 
cific respondent belongs and then asking him those questions 
most closely adapted to his type. The main distinction to be 
made is whether the respondent knew beforehand that he 
wanted a Philco or whether he decided on the brand only at
the time of actually purchasing the radio set. 

In the first case the main problem is whether the program 
has influenced his choice of brand from the start. Sometimes 
in beginning the interview with the question, “Why did you
buy this brand?” certain respondents immediately refer to
the advertising. It is more likely, however, that they will 
refer to certain advantages of this brand; in which case the 
interviewer must follow up with the question, “Where did you
learn about those things?” At this point some will refer to
the radio plug, but others will mention friends or other influ- 
ences; then, to make certain that no possibility is overlooked, 
this question must be added: “Do you remember having read
or heard anything else about the Philco radio?” (If Yes)
“What was it about? Where did you see or hear it?” At
this point we shall have learned about everything a respondent 
of this particular type is able to tell us. 

In dealing with people whose choice of brand was not determined
before the purchase but developed as they shopped
around, the adequate process of interviewing is somewhat different.
These respondents are likely to tell the interviewer
about salespeople who influenced them, or about how impressed
they were by the Philco they saw in a store. To make certain
that no possible influence of Boake Carter’s advertising has
been overlooked, the task of the questionnaire now is to ascertain
very concretely every phase of this shopping process and
to ask at each point whether any preliminary knowledge of
Philco’s quality was remembered at the time the decision was
made. (For example, “When the salesgirl showed you the
Philco, was it the first time you had heard about it, or had you
known about the brand before? If so, from what source did
you learn about it?” etc.)


5 Paul F. Lazarsfeld, “The Use of Detailed Interviews in Market Research,”
The Journal of Marketing, July, 1937.

There are always specific situations to which the attention 
of the interviewer has to be drawn by the questionnaire. For 
instance, it is important to ascertain whether the respondent 
has established in his mind a range of eligible makes which he 
takes for granted, e.g., that he would choose only a nationally 
known brand. If so, it is important to find out whether previ- 
ous advertising has contributed to putting Philco into this 
eligibility range, even though the respondent may not have 
mentioned it specifically when reporting the purchase. 

The main idea of such a questionnaire, then, is to visualize 
the primary psychological types of purchasers, to anticipate 
for each the role which the advertising under investigation 
can play, and then to ask for each type those questions which 
are most likely to bring out the role it actually did play. 
After having first spoken about the purchase and traced its 
possible determinants, we then turn around and review the 
whole situation once more by starting from the advertising 
itself. A series of questions is asked starting from the more 
general and becoming more and more specific so that no pre- 
ceding answer can pre-judge the next. 


“Can you recall ever seeing or hearing an advertisement
for Philco before making the final purchase?” If
yes, more detailed information is asked; if no, we proceed:

“Have you ever heard any Philco advertising over the
radio?” If yes, details are asked; if no, we proceed:

“Did you ever listen to Boake Carter? Do you know
what product he advertises?”

As a third approach, we try throughout the interview to
learn what weight the respondent himself gives to the different
factors mentioned. At several points the question is inserted:
“What influence would you say this (factor just mentioned)
has had upon your purchase?” At the end of the interview
a question is added which tries to summarize the impression 
which the respondent has given of the total cause of his pur- 
chase as far as the role of the radio program is concerned. 

As a result of each such interview—which, incidentally, is 
not unpleasant for the respondent and does not take very long 
—we have a great amount of structuralized information on the 
purchase; and although the respondent might have forgotten 
or ‘‘repressed’’ a great deal, at least all the screens which are 
so frequently due to misunderstandings have been eliminated 
and all the respondent himself can possibly know about the 
cause for his purchase has been brought to light. 


WHAT IS TO BECOME OF THE INFORMATION THUS 
COLLECTED? 


We are interested in knowing how many Philco owners have 
been influenced by Boake Carter to buy their radio set. We 
call ‘‘influenced’’ any person who otherwise would not have 
bought a Philco. If it were technically possible, an experiment 
should be set up dividing people at random into two 
groups; the one group should be made to listen to the program 
and the other to refrain from listening. How many more 
Philco sets would be found in the former group after a cer- 
tain period of time? This experiment, of course, is techni- 
cally impossible. To what extent, then, can the information 
collected by our interview procedure be substituted for the 
experiment? 

In order to answer the above question it is necessary to be 
able to analyze each of the questionnaires in great detail for 
any influence Boake Carter may have had. The problem to be 
solved in relation to the interview technique, then, is ‘‘How do 
we know when Boake Carter was the determining factor in 
the purchase?’’ While it is always very difficult to decide 
‘‘yes’’ or ‘‘no’’ as to this influence for a single case, it is quite 
another matter to be able to decide for two cases whether the 
one case is more or less influenced than the other. In other 
words, it is not an impossible task to establish a continuum 
upon which all the cases are ranked in order of increasing 
Boake Carter influence. This continuum in turn may 
arbitrarily be broken into any number of groups, depending 
upon how sharply we wish to distinguish one group from 
another. 

As people tell us about their experiences in buying their 
radios, this continuum seems to fall most naturally into four 
groups, graded according to the degree of influence of the 
program. 

Grade 1: People who remember only after being prompted 
that they heard about Philco on the Boake Carter program. 
Their report clearly shows some other factor to be the out- 
standing influence, and we can find no relation between the 
program and their choice of brand. In most cases, the Philco 
was not decided upon until the actual purchase was made. 

Grade 2: Boake Carter is mentioned by the respondent as a 
source of information in the early parts of the interview; he 
feels that the program may have influenced him somewhat, 
but is sure that he would have bought a Philco even if he had 
not listened to the program. The presence of other factors 
contributing to the purchase is brought out clearly in the inter- 
view. 

Grade 3: Here the program is mentioned immediately as a 
source of information and definitely as an influence; but there 
are other factors which seem equally important, and no deci- 
sion as to the main influence is possible. The respondent 
himself feels that even without Boake Carter, he might have 
purchased a Philco radio. While the importance of Boake 
Carter is felt, it is not decisive. 

Grade 4: The importance of Boake Carter is acknowledged 
almost as soon as the interview opens, usually spontaneously. 
The program is mentioned as the main reason for buying a 
Philco; no other factor is mentioned as decisive and the respondent 
states that he would not have bought a Philco if it 
had not been for Boake Carter. 
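
The four grades can be read as a rough decision rule. The sketch below is one hedged way to write that rule down. The field names and the `Counterfactual` coding are hypothetical and appear nowhere in the study, whose judges weighed each whole interview rather than four coded answers.

```python
from dataclasses import dataclass
from enum import Enum


class Counterfactual(Enum):
    """Coded answer to: would you have bought a Philco were it not for Boake Carter?"""
    NO = "would not have bought"
    MAYBE = "might have bought"
    YES = "would have bought anyway"


@dataclass
class InterviewSummary:
    # All four fields are hypothetical codings of one interview record.
    recalled_only_when_prompted: bool  # program remembered only after prompting
    names_program_as_influence: bool   # respondent credits the program at all
    other_factor_decisive: bool        # some other factor is named as decisive
    counterfactual: Counterfactual


def influence_grade(s: InterviewSummary) -> int:
    """Return the degree of program influence, from 1 (lowest) to 4 (highest)."""
    if s.recalled_only_when_prompted and not s.names_program_as_influence:
        return 1
    if s.counterfactual is Counterfactual.NO and not s.other_factor_decisive:
        return 4
    if s.counterfactual is Counterfactual.YES:
        return 2
    # The program is felt as an influence but is not shown to be decisive.
    return 3
```

Under this coding a respondent reaches Grade 4 only by answering ‘‘No’’ to the counterfactual question while naming no other decisive factor, which is the line the text draws between Grade 3 and Grade 4.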

In relation to our problem of deciding when we can attri- 
bute the purchase to Boake Carter, we see that the real diffi- 
culty comes in determining where to draw the line between 
Grade 3 and Grade 4. The main distinguishing characteristic 
between the two grades lies in their answers to the question, 
‘‘Would you have bought a Philco were it not for Boake Carter?’’ 
In order to be classified as Grade 4, a respondent would 
have to answer ‘‘No’’ to this question. Let us examine com- 
parative accounts of two respondents, one of whom falls into 
Grade 3 and the other into Grade 4. 


Grade 3 

History of Purchase 
Respondent was dissatisfied with old radio and wished to get 
a more up-to-date model. Knew about Philco through the Boake 
Carter program but inquired among friends as to how satisfied 
they were with their radios. At the time of the purchase 
itself, they were convinced that Philco was the best buy, and 
went to the store intending to buy one. Their knowledge of 
Philco is rather limited and comes from Boake Carter and 
their friends. They listen to Boake Carter regularly and had 
been doing so for one year before making the purchase. 

Analysis of All Factors 
Recommendation by friends and knowledge of Philco through 
Boake Carter were the two influences present in this case. 
Boake Carter aroused the interest in Philco and led to further 
investigation. The recommendation by friends confirmed Boake 
Carter’s advertising and led to the purchase. 

Determination of Main Influence 
The fact that the respondent sought confirmation of Boake 
Carter’s statement shows that although this influence was 
present and important, it was not the determining one. Were 
it not for favorable reports from various friends, the 
respondent would probably not have bought a Philco. While 
both factors are of great importance, the determining factor 
seems to be ‘‘recommendation.’’ 


Grade 4 

History of Purchase 
Respondent could not remember exactly why purchase was made 
at that time, stating that he just needed a new radio. When 
they went to make the purchase they had definite intentions 
of buying a Philco. They had heard about Philco from friends, 
through newspaper advertisements, and through Boake Carter. 
They definitely knew that it was the kind of radio they 
wanted, knew it to have a good tone and to be a well-constructed 
machine. Their knowledge of Philco was quite definite, mainly 
because of Boake Carter. They listened to him five times a 
week and had been listening to him for three years before 
purchasing the Philco. 

Analysis of All Factors 
Three possible influences can be detected: the friends’ 
recommendation, newspaper advertisements, and Boake Carter. 
The good-will towards Philco is apparent throughout the 
interview and is due probably to all three of the above 
factors. However it is also apparent from the respondent’s 
own statement that this good-will was due mainly to Boake 
Carter. The following statement shows this clearly: ‘‘My 
attention was first called to Philco through Boake Carter’s 
program and I got to thinking that Philco must be a pretty 
good buy. Boake Carter has a good program and is an expert 
commentator. I don’t think he would advertise a product that 
wasn’t excellent or work for a company that wasn’t completely 
reliable.’’ 

Determination of Main Influence 
This statement by the respondent indicates clearly the main 
influence. In conclusion, the respondent states that were it 
not for Boake Carter he probably would not have bought a 
Philco. 


From the above comparison it can be seen how we can place 
one respondent higher on the scale of Boake Carter influence 
than the other. While the decision that one is definitely a 
Boake Carter influence and the other is not must be arbitrary, 
we feel that if we were to limit the Boake Carter influence to 
Grade 4 only, we would be getting at those people who really 
‘‘would not have bought a Philco without Boake Carter advertising 
it.’’ 

In regard to all four grades, a careful analysis of all the 
cases resulted in the following distribution: 


TABLE 2 

Degree of Influence of Boake Carter’s Program on 155 Purchases 
of Philco Radio Sets 

Degree                            Proportion 
1 (lowest) ....................      48% 
2 .............................      20 
3 .............................      13 
4 (highest) ...................      19 
Total .........................     100 
Total number of cases .........     155 


It is the cases of influence-degree 4 which seem to correspond 
to what we would call colloquially, purchases caused 
by Boake Carter’s advertising. We are therefore inclined to 
state that of these 155 cases, 19 per cent were induced by radio 
advertising to buy a Philco set. This means that, to the best 
of our general psychological knowledge, we feel that this proportion 
of respondents would not have bought a Philco if it 
had not been for Boake Carter’s program. 

Practically, the figure is of great importance as soon as we 
add two other items of information available from the records 
of most advertisers. If the approximate size of the audience 
to a program and the number of Philco owners in the total 
population is known, it is possible to calculate whether the 
particular advertisement is paying its way in view of the 
knowledge that 19 per cent of this audience would not buy the 
product were it not for the program. Restricting ourselves 
here to the psychological aspect of the problem, we do not 
enter into the somewhat intricate statistical considerations 
which are suggested by this last remark. 


HOW CAN THE RESULT BE TESTED FOR VALIDITY? 


There is one relatively simple way of doing this. It con- 
sists in using several judges for the classification of the inter- 
views. Two additional judges were made thoroughly familiar 
with the criteria we used to ascertain the highest degree of 
influence, and then were asked to classify the cases in their 
own way. One ended up with 21 per cent of the purchases 
induced by the program and the other with 18 per cent. 
These figures do not deviate too much from the 19 per cent 
found by the first judge. 

But it can be objected that this is only an agreement on bias 
because all three judges proceeded on the same basis of a common 
sense analysis. It would be highly desirable to find more 
objective tests which could be applied to determine validity 
in such cases. 

The best type of check would be a tie-in with one of the 
major advertising tests being made from time to time by 
business agencies. Two test cities are usually selected in only 
one of which an advertising campaign is put on. Then the 
sales are observed and if they rise in the city covered by the 
campaign the advertising is considered promising. It would 
be highly desirable to interview those people in the campaign 
city who bought the product and ask them why they bought 
it. The number of people who, on the basis of interviews 
such as the one discussed in this paper, are finally considered to 
have been influenced by the program should be equal to the 
difference in customers between the campaign city and the 
control city. 

As a substitute for such a procedure we used a situation 
which offered itself through a lucky coincidence. In the 
Spring of 1938 the change from standard to daylight time 
forced a change of time in the Boake Carter program which 
we were studying. The commentator came on an hour and a 
quarter earlier than he had been broadcasting previously. 
When another telephone survey was made two weeks after the 
change in time, the correlation between Philco ownership and 
Boake Carter listening was higher. In a four-fold tabulation 
of 773 cases parallel to the one given at the beginning of this 
paper, the tetrachoric correlation between Philco ownership 
and Boake Carter listening was .3 whereas it was .1 just two 
weeks previous to the change of time.⁶ There were, then, two 
samples of people who showed to varying degrees an objective 
association between two characteristics. If our interviews 
and the classification based upon them were valid, then in the 
second sample more people should report that they were 
induced by Boake Carter to buy a Philco.⁷ 

And indeed it turned out that according to our interviews 
15 per cent of the first group were classified as ‘‘buying because 
of Boake Carter’’ and 23 per cent of the second group.⁸ 

We have here, then (for the first time as far as we know), 
an objective test of the validity of direct interviews intended 
to ascertain the influence of a specific series of advertisements 
upon the buying habits of people. The objective test and the 
results of the interviews corroborate one another. 


6 The reason for this correlation being higher after the change of time 
than before does not make any difference for the purpose of this study. 
The most probable interpretation which comes to mind is that the more 
loyal Boake Carter listeners are more likely to adjust their own schedules 
to the new time of the broadcast; since they are more loyal to the program 
they might easily be influenced by it. So after the change of time we 
find a higher correlation between listening and ownership. However that 
may be, for our present purpose the only information needed is the very 
fact that we have here two samples, with one showing a higher correlation 
than the other. Incidentally, it is from these two samples that our 155 
cases were selected for personal interview. See footnote 4. 

7 This expectation would be justified only if none of the spurious factors 
mentioned at the beginning of this paper entered into those two correlations. 
Fortunately this seems to be the case. By including telephone 
subscribers only, we eliminate the lower economic half of the population, 
which, it is well known, listens much less to commentators. For the telephone 
subscribers there appeared no significant difference in Boake Carter 
listening or in Philco ownership when special tabulations for different 
economic groups were made. Furthermore, twenty-three Philco owners 
who started to listen to Boake Carter after buying the Philco were interviewed 
as to whether ownership made them more inclined to listen when 
the program came on; no trace of such a direct relationship (listening 
influenced positively by ownership) could be found. 

8 For this test the interviews taken in the first sample consisted of 78 
cases and, in the second sample, of 77 cases. The difference is statistically 
reliable. 


SUMMARY 


Two samples were available, each of which could be arranged 
in a four-fold tabulation indicating the correlation 
between listening to a commentator program and owning the 
make of radio set which this program advertised. The tetrachoric 
correlation in the one sample was .1 and in the second 
sample, .3. In each sample the people who listen to the program 
and own the advertised brand were interviewed as to 
how they came to buy their radio sets. The answers were 
classified into four degrees of sales influence which could be 
attributed to the commentator. Degree 1 indicated no influence 
at all, whereas degree 4 was given to those cases where 
three judges agreed that this special make of radio would not 
have been bought were it not for the influence of the program. 
The purchasers of influence degree 4 could therefore be considered 
those who, we would say colloquially, bought their sets 
because of the commentator’s influence. 
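
For readers who wish to see how a tetrachoric coefficient can be obtained from a four-fold tabulation, a minimal sketch follows. The paper does not say how its coefficients of .1 and .3 were computed; the cosine-pi formula used here is a common approximation, swapped in for illustration only. The function name and the cell counts are hypothetical and are not the study's data.

```python
import math


def tetrachoric_cos_pi(both: int, listen_only: int, own_only: int, neither: int) -> float:
    """Approximate the tetrachoric correlation of a four-fold table.

    Uses the cosine-pi approximation r = cos(pi / (1 + sqrt(a*d / (b*c)))),
    where a and d are the concordant cells and b and c the discordant ones.
    """
    cells = (both, listen_only, own_only, neither)
    if any(n <= 0 for n in cells):
        raise ValueError("the approximation needs four positive cell counts")
    odds_ratio = (both * neither) / (listen_only * own_only)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical counts, chosen only to illustrate the call.
r = tetrachoric_cos_pi(both=30, listen_only=70, own_only=50, neither=150)
print(f"approximate tetrachoric r = {r:.2f}")
```

The approximation returns zero when listening and ownership are independent (a·d = b·c) and rises toward 1 as listeners who own the set, and non-listeners who do not, come to dominate the table.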

The main purpose of the study was to see whether the sta- 
tistical result of such judgments would be corroborated by the 
objective correlation between ownership and listening. There- 
fore, after the classification of our cases into influence grades 
was made, the cases were divided into those coming from the 
first sample and those from the second sample. In the sample 
having the correlation .1, we found 15 per cent who had 
‘‘bought because of the radio program,’’ whereas in the sample 
with the .3 correlation, we found 23 per cent of these cases. 
This result was taken as one piece of evidence that by appro- 
priate direct interviews we can measure the selling influence 
of a radio program, if by this influence we understand the 
difference in sales between a group which is and a group which 
is not exposed to this program. 
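
The comparison of 15 and 23 per cent can be set out numerically. The sketch below assumes whole-case counts of 12 out of 78 and 18 out of 77, which are inferred here from the percentages and from the sample sizes in footnote 8; the paper does not print them. It also does not name the test behind its statement that the difference is reliable, so the pooled two-proportion z statistic shown is a present-day stand-in and not the authors' calculation.

```python
import math


def two_proportion_z(hits_1: int, n_1: int, hits_2: int, n_2: int) -> float:
    """Pooled two-proportion z statistic for the difference p2 - p1."""
    p_1 = hits_1 / n_1
    p_2 = hits_2 / n_2
    pooled = (hits_1 + hits_2) / (n_1 + n_2)
    std_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_1 + 1.0 / n_2))
    return (p_2 - p_1) / std_error


# Assumed counts: 15 per cent of 78 and 23 per cent of 77, as whole cases.
grade4_first, n_first = 12, 78
grade4_second, n_second = 18, 77

share_first = grade4_first / n_first
share_second = grade4_second / n_second
share_overall = (grade4_first + grade4_second) / (n_first + n_second)
z = two_proportion_z(grade4_first, n_first, grade4_second, n_second)

print(f"first sample:  {share_first:.0%}")
print(f"second sample: {share_second:.0%}")
print(f"both samples:  {share_overall:.0%}")
print(f"z statistic:   {z:.2f}")
```

With these assumed counts the two samples together reproduce the 19 per cent of Grade 4 cases reported for all 155 interviews, which is some reassurance that the counts are the right ones.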




















THE PANEL AS AN AID IN MEASURING 
EFFECTS OF ADVERTISING 


MARJORIE FLEISS 
Office of Radio Research, Columbia University 


In studying the effects of radio advertising upon consumer 
purchases, researchers have been blocked by two obstructions. 
The first is the respondent’s vague memories of the 
circumstances in which she changed brands; the second is the 
inadequacy of questionnaires in ferreting out all influences, 
even when investigators are lucky enough to find recent 
changes. 

The first obstruction can be skirted by the use of repeated 
interviews that would enable the investigator to spot a brand 
change shortly after it had taken place. The second obstruc- 
tion can be reduced by the use of detailed questionnaires and 
an analysis of the types of interviewing errors to be avoided. 
A small trial panel¹ of housewives was therefore set up, under 
the direction of Frank Stanton, Research Director for the 
Columbia Broadcasting System. The housewives cooperated, 
ostensibly only to permit a weekly pantry check of specified 
products. In this way, an interviewer could spot a brand 
change shortly after it had taken place and follow this lead 
with elaborate why questions. 

The products to be checked in this test were toilet soap, pack- 
aged soap, white bread, cold cereals, tomato soup, vegetable 
soup, coffee, and cigarettes. Interviews were to be held each 
week for four weeks; then the advisability of scheduling addi- 
tional interviews was to be considered. 

A total of thirty-nine housewives agreed to be interviewed. 
Most of them lived in Long Island City; all were in middle 
class homes; most had no ’phones; none had college educations; 
their ages varied from the twenties to the fifties. Three 
women dropped out completely: one left town after the first 
interview; one refused after the second, the other after the 
third. Three other women missed one of the first four interviews. 

From the remaining thirty-three women, complete sets of 
four weekly interviews were secured. 


1 See: Daniel, Cuthbert, How to Run a Panel, F.R.E.C. Pamphlet, Washington, 
D. C., 1940. 


FREQUENCY OF CHANGES 


Table 1 presents the number and rate of brand changes 
that occurred in the thirty-three households where complete 
sets of four weekly interviews were obtained. 


TABLE 1 

Number of Changes per Product in 33 Households 
for First Four Interviews 

Product*            Number of           Number of    Changes per 
                    households that     changes      household 
                    use product                      per week 
Toilet soap ......       33                12           .12 
Packaged soap ....       32                 6           .06 
White bread ......       25                19           .25 
Cold cereals .....       28                31           .37 
Coffee ...........       31                12           .13 
Cigarettes .......       14                12           .29 
Total ............                         92          1.22 


* After the first interviews, it was observed that practically none of the 
respondents used vegetable soup. This product was therefore dropped 
after the second interview. Because tomato soup was also stocked by 
less than half the women, tomato juice was inventoried instead, after 
the second interview. Since these products were checked only twice and 
the number of households using them was not sufficiently great to warrant 
their inclusion, they were omitted from Table 1. 


Among housewives who use the six products listed, we might 
expect an average of 1.22 changes from week to week. From 
100 housewives, we should expect about 122 changes a week. 

But is a week the proper time period between interviews? If 
instead of holding four interviews, we had skipped the second 
and third, should we have lost many changes? Table 2 shows 
that if the middle two interviews had been omitted, that is, 
only one-third the number of recalls had been made, the efficiency 
of obtaining changes² would have been doubled. With 
three recalls, 1.22 changes were obtained per household per 
recall; with one recall over the same period of time, 2.40 
changes were obtained per household per recall. 


TABLE 2 

Number of Changes per Product in 33 Households 
if Second and Third Interviews Had Been Omitted 

                    For all four interviews     If second and third 
                                                interviews had been omitted 
Product             No. of     Changes per      No. of     Changes per 
                    changes    household        changes    household 
                               per recall                  per recall 
Toilet soap ......    12          .12             11          .33 
Packaged soap ....     6          .06              4          .13 
White bread ......    19          .25             14          .56 
Cold cereals .....    31          .37             22          .79 
Coffee ...........    12          .13              5          .16 
Cigarettes .......    12          .29              6          .43 
Total ............    92         1.22             62         2.40 
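
The rates in Tables 1 and 2 appear to be the number of changes divided by the number of households using the product and by the number of recalls. The sketch below recomputes them on that assumption. The household and change counts are those printed in the two tables; the fifth row is taken to be coffee, the only listed product otherwise missing, and the function and variable names are illustrative.

```python
# (product, households using it, changes over all four interviews,
#  changes if only the first and fourth interviews had been held)
TABLE_2_COUNTS = [
    ("Toilet soap", 33, 12, 11),
    ("Packaged soap", 32, 6, 4),
    ("White bread", 25, 19, 14),
    ("Cold cereals", 28, 31, 22),
    ("Coffee", 31, 12, 5),
    ("Cigarettes", 14, 12, 6),
]


def changes_per_household_per_recall(changes: int, households: int, recalls: int) -> float:
    """Spread the observed changes over the households and the recalls made."""
    return changes / (households * recalls)


# Four interviews give three recalls; dropping the middle two leaves one.
weekly = [changes_per_household_per_recall(c4, h, recalls=3)
          for _, h, c4, _ in TABLE_2_COUNTS]
single = [changes_per_household_per_recall(c2, h, recalls=1)
          for _, h, _, c2 in TABLE_2_COUNTS]

for row, w, s in zip(TABLE_2_COUNTS, weekly, single):
    print(f"{row[0]:<14} {w:6.3f} {s:6.3f}")
print(f"{'Total':<14} {sum(weekly):6.3f} {sum(single):6.3f}")
```

The recomputed figures agree with the printed ones to within .01. The small gaps that remain seem to come from rounding each row to two places before the column was added.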





Note that the sample in this trial study was small and that 
the rate of change would probably be different for another 
sample as well as for different times of the year. Still there 
is little reason for doubting that three weeks would be a more 
economical interviewing interval than one week. Assuming 
that the rates of change compared in Table 2 remained constant, 
we could expect 1098 changes from 100 women whose 
pantries were inventoried 10 times, a week apart; 2160 changes 
from the same number of women called upon the same number 
of times, but at three-week intervals. 
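
The two projections follow from multiplying the rate per household per recall by the number of women and by the number of recalls, where ten visits give nine recalls because the first visit only sets the starting inventory. A short sketch of that arithmetic, with a hypothetical function name, assuming every woman uses all six products:

```python
def expected_changes(rate_per_household_per_recall: float, women: int, visits: int) -> int:
    """Project the total number of brand changes for a panel.

    The first visit only records the starting inventory, so a panel
    visited `visits` times yields `visits - 1` recalls.
    """
    recalls = visits - 1
    return round(rate_per_household_per_recall * women * recalls)


weekly_total = expected_changes(1.22, women=100, visits=10)      # one-week interval
three_week_total = expected_changes(2.40, women=100, visits=10)  # three-week interval
print(weekly_total, three_week_total)
```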


2 In a series of interviews, the first becomes insignificant since it is used 
only as a starting point and no changes can be recorded then. 

Further investigation would be necessary to determine 
whether a still longer interval would be better and whether 
the memory loss would not be too great. According to investigators’ 
impressions, a fifth interview, conducted three weeks 
after the fourth, seemed to be no more difficult than the previous 
ones, but these impressions are by no means conclusive. 

As indicated before, the rate of change will not always 
remain the same from season to season. The fifth interview, 
held after the Fourth of July, showed more than twice as many 
changes as the previous interview covering an equal length of 
time. For the six products listed in Table 2, 66 changes were 
noted, whereas only 31 were recorded for the three-week inter- 
val between the first and the fourth interviews. This increase 
in number of changes could be due partly to chance, but a 
close-up of the questionnaires showed that much of the increase 
was prompted by the oncoming of summer. Whatever the 
explanation, this increase indicates that our first estimate is 
conservative. 

TYPES OF CHANGES 


So far, only the total number of changes has been discussed. 
Now it is necessary to differentiate between those changes 
which could have been affected by advertising and those which 
could not. Those which could not have been affected by adver- 
tising can be divided into four sub-groups; those which could 
have been affected by advertising can be divided into three 
sub-groups. Below is a classification of the several types. 
This classification might be modified or expanded in a larger 
study, but it is an inclusive grouping for this preliminary 
investigation. 


I. Advertising Not a Possible Influence 


A. No Preference by Housewife Herself Implied. 

In this group would be included new brands which 
were bought by members of the family other than the 
housewife; brands that were the only ones in the store 
where the housewife’s shopping was then done; gifts; 
samples. 

B. Pattern of Change Previously Established. 

Certain brands of products are used only for limited 
purposes or times. The change follows these established 
uses and no new influence is at work. Again, some 
housewives follow a regular policy of rotation among 
several brands. In so far as they follow the same pat- 
tern, the changes are not then affected by advertising. 

C. In or Out of Stock. 

Here are classified those changes where the housewife 
has used up her stock of the brand previously inven- 
toried, and has not replenished it, though she fully ex- 
pects to. If she has just restocked a brand which was 
absent during the previous interview for the same reason 
but had been stocked the time before, such change is also 
placed in this classification. 

D. Brand Discontinued for Other Reasons, but No Substitute 
Made. 

Some women, dissatisfied with an old brand, discon- 
tinue it but make no substitute. They may be discon- 
tinuing the commodity altogether or be using other 
brands of the product now stocked. Those who expect 
to substitute but have not yet purchased one will finally 
make changes which could be influenced by advertising. 
Also included in this group are the cases of discontinu- 
ance, prompted not by dissatisfaction, but a greater 
interest in the other brands stocked. 


II. Advertising a Possible Influence 


A. Purchase of New Brand Merely because of Interest 
Aroused in It. 

Here there is no dissatisfaction with the old brand but 
information obtained aroused sufficient interest for the 
housewife to buy the new brand. In some cases, the 
housewife, though not dissatisfied with an old brand, 
likes to change once in a while just for the sake of change. 
She follows no regular policy of rotation but is ready to 
become interested in a new brand anyway. 

As a source of information or influence, advertising 
could be the only source, or one among other sources, that 
determine the choice. It could be the source, or one of the 
sources, through which the housewife learned about some 
intrinsic attributes of the product which appealed to her, 
attributes such as taste or color. Another combination 
of factors could be extrinsic attributes such as price plus 
the sources of information. Finally, a housewife’s decision 
could be based on her opinions of intrinsic and extrinsic 
attributes as influenced by possible sources of 
knowledge. In any of these four combinations, advertising 
could play a role. 

B. Purchase of New Brand because of Dissatisfaction with 
Old. 

Changes of this type are initiated by some dissatisfac- 
tion with the old brand. When the housewife looks 
around for a new brand, then advertising can affect her 
choice. It can play the same roles noted above in II-A, 
that is, it can be the sole factor or one among other 
sources of information, or it can be a contributing factor 
in that it is the source of information for some specific 
attributes which appeal to the housewife. 

C. Purchase of New Brand because of Inability to Get Old. 

The housewife wanted the old brand but was unable to 
obtain it at that time. How did she select the substi- 
tute? Did advertising have anything to do with it? 
Sub-classifications can be made similar to those outlined 
in II-A and II-B. 


Table 3 presents the distribution of types of changes found 
among the 157 questionnaires³ analyzed. Of these, 18 per 
cent were of Type II, where advertising could have been an 
influence.⁴ 


TABLE 3 

Distribution of Types of Changes 

                                                               %      % 
I. Advertising Not a Possible Influence .................            82 
   A. No preference by housewife herself implied ........     16 
   B. Pattern of change previously established ..........     18 
   C. In or out of stock ................................     42 
   D. Brand discontinued for other reasons, but no 
      substitute made ...................................      6 
II. Advertising a Possible Influence ....................            18 
   A. Purchase of new brand merely because of interest 
      aroused in it .....................................     14 
   B. Purchase of new brand because of dissatisfaction 
      with old ..........................................      3 
   C. Purchase of new brand because of inability to 
      get old ...........................................      1 
Total ...................................................           100 
Number of cases .........................................           157 


3 Sum of changes in 33 households for first four interviews, and in 25 
households for fifth interview. This is for all products, including tomato 
soup, tomato juice, and vegetable soup. 

4 According to previous estimate on p. 687, 100 women, re-interviewed 9 
times at three-week intervals, would make about 2160 changes. A further 
estimate based on the distribution of changes noted above indicates that 
about 389 of the 2160 changes would be of Type II, where advertising 
is a possible influence. 


REASONS FOR CHANGES 


Type II changes, which can be influenced by advertising, 
were sub-grouped according to the difficulty that would be 
encountered in uncovering advertising’s influence. Group A 
would present the least difficulty because the purchase was 
prompted only by an interest in the new brand. The housewife’s 
attention would be centered on the way this interest 
had been aroused and she would talk more readily in terms 
of it. Since she was not prompted by dissatisfaction with 
some attribute, she would be less likely to be thinking of the 
compensating attribute of the new brand and more likely to 
be thinking in terms of the influences that led her to buy it. 

Because in Group B of the Type II changes the housewife 
was dissatisfied with the brand she had been using, she would 
be more likely to offer the compensating virtues of the new 
brand as the reason for her change. This type would require 
more cautious follow-up to obtain all sources of information 
which led to the selection of a new brand. One of the sources 
of information might be advertising. 

For Group C, advertising’s influence would be least acces- 
sible to the interviewer. The housewife was just unable to 
get the brand she wanted, a brand with which she was com- 
pletely satisfied. She didn’t want to change and hadn’t had 
her interest aroused in the new brand. Since the purchase 
was just an incidental one to her, she might be less sure of 
how she happened to select the substitute brand and what 
influences may have affected her choice. 

Below is a sample of the reasons for a Type II-A change, 
prompted by interest in the new brand: 


Respondent likes to try new brands of coffee once in a 
while. She was ready for a change and remembered that 
she had intended to try Maxwell House because the radio 
program, Good News, said it was good to the last drop. 
She also remembered reading this in magazines, but 
stressed the program as the main influence. 

In this case advertising’s influence came to the surface 
promptly. Another example of a Type II-A change follows. 
In this one, the housewife was not looking for a change. 
Although advertising was not an influence, we have a clear 
picture of what did influence her and can feel sure that no 
sources of information were missed. 


Respondent’s neighbor had told her several weeks be- 
fore the purchase that the neighbor’s children enjoyed 
Shredded Ralston. Respondent had never heard of it 
before but then decided she would have to try it for her 
children. She had been using only hot cereals and waited 
until it grew warmer before she bought the cold cereal. 

The reasons for this change illustrate some of the details that 
must be known in order to have a complete story, that is, one 
where a sub-surface effect of advertising can be raised. If we 
had not learned that the neighbor was the first source of infor- 
mation, we should wonder whether the respondent had heard 
anything about the brand over the air and whether the neigh- 
bor’s report was not just a reminder. Then too, if we had not 
learned why she waited several weeks before actually buying 
the Shredded Ralston, we could not have been certain that the 
neighbor was the only, rather than just the initial influence. 

In interviewing about Type II-B changes, one must not stop 
at the virtues of a brand which make it more desirable than 
the brand discontinued, but must find out where the respon- 
dent learned about these virtues. Below is a sample of the 
reasons for such a change. 

The housewife complained that her previous brand of
cold cereal seemed tasteless. Wheaties, however, were 
supposed to be good and tasty. She had known about 
Wheaties for a long time because her son listened to the 
baseball broadcasts. After hearing the broadcasts a 
while, she thought she might try them some time, but 
nothing prompted her to do so until she became dissatis- 
fied with her old brand. 

If the investigator had stopped after learning that Wheaties 
were supposed to be good and tasty, he would not have learned 
about the main influence, the baseball broadcasts. Then too, 
if he had not asked about the time-lag between her decision to 
buy Wheaties at some time and her actual purchase, we could 
not be sure that another influence had not been omitted. 

In our sample, there was only one case of the Type II-C
change, prompted by inability to get the old brand. 


Respondent was unable to get Martinson’s coffee, so she 
bought Beechnut because she had used it before she started 
to use Martinson’s. 

This is an incomplete accounting of reasons. Had the 
respondent previously used any other brand besides Beechnut, 
and if so, what made her choose Beechnut rather than any of 
the others? 

INTERVIEWING PROBLEMS 


The difficulties which face both interviewer and question- 
naire-maker in detecting advertising’s influence can be 
grouped into four classes:

1. Mention of an attribute as a reason for choosing a brand,
without specifying the source of information or influence.⁵
An example of an interview where this error was avoided has 
been cited above for a Type II-B change. The respondent 
first spoke of Wheaties as tasty. The investigator then found 
out where she had learned this. 

2. Mentioning some influences or sources of information but
not all. Respondents frequently feel that if they have mentioned
one major influence they have told the whole story. As
the example given above of a Type II-A change shows, the
respondent at first spoke only of the radio program, Good
News, the source for her information about Maxwell House
coffee. An additional question reminded her that she had
also read of the brand in magazines.

5 See Lazarsfeld, Paul F., ‘‘The Art of Asking Why,’’ National
Marketing Review, I, 1, Summer 1935.

3. Mentioning as a reason a factor which also operates for 
another brand. An example of this kind of omission was 
cited above for the Type II-C change where the housewife 
said she bought the substitute brand because she had used it 
some time before. Another example would be a housewife’s 
offering price as a reason when other brands are equally inex- 
pensive. In such cases, one must then inquire for additional 
reasons that led her to choose this brand, because the reason she 
gives applies to other brands as well. When not readily 
apparent, the interviewer should ascertain whether or not the 
factor is operative, that is, whether it applies to other brands 
as well. 

4. Unexplained time-lag between the effective factors and
the purchase. If the time-lag remains unaccounted for, one
may be missing a decisive influence that precipitated the purchase.
The second case reported for a Type II-A change,
where cool weather accounted for the delay, and the Type
II-B change, where previous satisfaction with the old brand
accounted for the delay, show how such omissions were avoided.

To reduce the frequent gaps in information, then,
questions should be arranged and interviewers advised so that
mention of an attribute, such as taste, is always followed by a
query about the source of information; that mention of an
influence is followed by a request for other influences (if
friends and store clerks are referred to, such replies should
be checked by asking whether their speaking of the brand
brought anything to mind which the respondent may have
heard or read elsewhere); that mention of a factor which may
be common to more than one brand is followed by questions
to determine whether or not it is (were there any other brands
as cheap?) and, if so, what influenced the decision in favor of
the particular brand chosen; and that any time-lag of more
than about two weeks between effective reasons for choosing a
brand and the actual purchase is explained by the housewife.
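
The four follow-up rules just stated can be read as a small checklist applied to each reason a respondent gives. The sketch below is only an illustration of that reading: the function name, the dictionary keys, and the wording of the probes are hypothetical and do not come from the questionnaire used in the trial. Only the two-week threshold is taken from the text.

```python
# Hypothetical sketch: keys and probe wording are illustrative assumptions.
LAG_LIMIT_WEEKS = 2  # "more than about two weeks", per the text


def follow_up_probes(reply):
    """Return the follow-up questions still owed for one stated reason."""
    probes = []
    # Rule 1: an attribute (e.g. taste) with no stated source of information.
    if reply.get("attribute") and not reply.get("source"):
        probes.append("Where did you learn this about the brand?")
    # Rule 2: one influence named, others not yet asked for.
    if reply.get("source") and not reply.get("asked_other_sources"):
        probes.append("Did you hear or read of the brand anywhere else?")
    # Rule 3: a factor (e.g. price) that may hold for other brands too.
    if reply.get("may_apply_to_other_brands"):
        probes.append("Does this hold for other brands too? If so, why this one?")
    # Rule 4: an unexplained delay between the reason and the purchase.
    if reply.get("lag_weeks", 0) > LAG_LIMIT_WEEKS and not reply.get("lag_explained"):
        probes.append("What made you wait before buying?")
    return probes


# The Wheaties reply before probing: an attribute with no source given.
print(follow_up_probes({"attribute": "tasty"}))
```

A reply that already names its source, has been checked for other influences, and carries an explained delay returns an empty list, which corresponds to what the text calls a complete story.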


Because of the small sample in this trial, the reasons for 
change were not subjected to statistical analysis. Experience 
in related fields, however, indicates that when enough signifi- 
cant changes are secured, it will be possible to analyze the 
reasons for such changes statistically and to make quantitative 
estimates of the roles of radio and other advertising media. 





SUMMARY 





Repeated interviews (panel) with housewives were tried 
with a small sample as a means of spotting brand changes 
shortly after they occur and thus reducing the memory loss of 
effective reasons for the changes. This technique seems 
worthy of further investigation as a new tool for uncovering 
radio’s influence on consumer purchases. . . . Three weeks are 
more economical than one as an interviewing interval for this 
kind of panel. With three weekly recalls, 92 brand changes 
were noted in 33 households; if only one recall had been made 
at the end of the three weeks, 62 changes could have been 
obtained anyway. . . . The rate of brand stock changes seems
to vary seasonally. . . . Based on advertising’s possible influence,
a classification of all brand stock changes was made.
Of 157 questionnaires analyzed, 18 per cent could have been 
affected by advertising. These were sub-divided according to 
the likely ease with which advertising’s influence could be 
detected. . . . A conservative estimate indicates that 389 such 
changes would be found among 100 women re-interviewed 9 
times at three-week intervals; with a less conservative estimate 
for the same number of interviews and the same six products 
considered in this test, one might expect as many as 906 
changes where advertising could have had an effect. . . . An
interviewing method was developed to uncover the actual role 
which advertising played in effecting brand changes. 
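
The economy claim in the summary can be checked with the figures it reports. The sketch below assumes that each recall means one interview in each of the 33 households; that assumption, and the variable names, are not stated in the summary.

```python
households = 33
changes_weekly = 92   # noted with three weekly recalls
changes_single = 62   # obtainable with one recall after three weeks

# Assumption: one interview per household per recall.
interviews_weekly = households * 3
interviews_single = households * 1

yield_weekly = changes_weekly / interviews_weekly
yield_single = changes_single / interviews_single
share_retained = changes_single / changes_weekly

print(f"changes per interview, weekly recalls: {yield_weekly:.2f}")
print(f"changes per interview, single recall:  {yield_single:.2f}")
print(f"share of changes still caught:         {share_retained:.0%}")
```

Under that assumption a single recall keeps about two-thirds of the changes for one-third of the interviews, roughly doubling the number of changes found per interview.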











THE RELATION BETWEEN ‘‘RADIO PLUGS’’ 
AND SHEET SALES OF POPULAR MUSIC 


MICHAEL ERDELYI 
University of Scranton 


On few subjects is there so much information available
as on the life history of popular songs. Because of
the great financial interests involved, there are extensive
records kept on the number of times a song is played over
the air, the number of printed score-sheets sold, the number
of recordings sold, and the number of times a song is played over
the half-million ‘‘nickelodeons’’ which exist in this country.
A number of problems on the relationship between supply and
demand in the field of popular music can be studied with this
kind of material. The psychological problems involved are
somewhat similar to the psychology of fashions in clothes, but
the existing documentation is infinitely richer in the field of
popular music.

It is estimated that about fifty to sixty million songs are
played by the 850 radio stations in this country. Half of
these performances are given to about 2 per cent of all the
songs. About two-thirds of the songs which are hits on the
radio are also leading according to the other indices such as
sales, orchestra requests, and so on.

Simply to give an example of the many kinds of investigations
which could be carried through, we select the problem
of the relationship between the frequency with which popular
songs are played over the air and the amount of sheet sales
of these songs. All of the songs used were extensively
plugged¹ during one of the two periods, January 14 to April
8, 1939, and January 13 to April 6, 1940.

1 ‘‘Plugs’’ is the trade term for performances over the air.

Since each song goes through its cycle of increase, maximum
and decrease both for plugging and for sales, it is necessary
to choose those songs which have some considerable fraction
of their life cycle in the period under observation. Those
songs which showed six or more rankings in each classification
(plugging and sales) during one of the two periods were used
in this study. Although 100 different songs appeared in the
rankings at some time during these periods, only twenty were
ranked six or more times in both categories.
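
The selection rule amounts to a filter on the number of weekly lists in which a song appears. A minimal sketch follows; the function name, the data layout, and the three placeholder titles with their counts are invented for illustration and are not data from the study.

```python
MIN_RANKINGS = 6  # from the text: six or more rankings in each list


def eligible_songs(plug_weeks, sales_weeks, min_rankings=MIN_RANKINGS):
    """Titles ranked at least `min_rankings` times in both lists.

    plug_weeks and sales_weeks map a title to the number of weekly
    lists (plugging, sheet sales) in which it appeared.
    """
    return sorted(
        title for title in plug_weeks
        if plug_weeks[title] >= min_rankings
        and sales_weeks.get(title, 0) >= min_rankings
    )


# Invented counts for three placeholder titles.
plug_weeks = {"Song A": 9, "Song B": 7, "Song C": 4}
sales_weeks = {"Song A": 8, "Song B": 3}
print(eligible_songs(plug_weeks, sales_weeks))
```

Only the first placeholder passes, since the second falls short in the sales list and the third in the plugging list.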

A chart was made for each song showing its rank in the 
plugging list week by week and also its rank in the sheet-sales 
list. This chart can be shown as a pair of graphs using as 
abscissae the number of the week (from 1 to 13) and as ordi- 
nates the ranking (from a low rating of 1 to a high rating of 
15, since 15 songs were ranked each way in each published list) 
in the sales or plugging list. 

Table 1 shows the life cycle of the song ‘‘Indian Summer.’’
It is given here as a good example, since it shows more clearly
than most how plugging increases before sales and falls off
while sales are still high. The fractional rankings mean that
two songs received the same ranking.


TABLE 1
Plugging and Sales Ranking for the Song ‘‘Indian Summer’’

[Table not recoverable: the week-number row and the alignment of
entries to weeks are illegible. Surviving entries, in the order
extracted:]
Plugging rank: 13.5 9.5 7.5 4.0 3.5 3.0 6.0 2
(row label lost): 2.0 3.0 2.0 1.0 2.0 4.0 3.0
Sales rank: 11.0 10.0 8.0 6.0 5.0

In order to get a simple index from each chart of the
expected lag between the plugging and the sales curves, the
number I was computed:

$$I = \frac{\sum W \times r}{\sum r}$$

W is the number of each week counting from the beginning
of the observation period. It runs from 1 to 13. r is the
(inverted) rank of the song in the corresponding period. The
summation in the numerator is carried over all weeks, calling
r zero in those weeks where no ranking appears. The summation
in the denominator is carried over all the listed rankings.
The number I is then a weighted average of the weeks and
gives the location in the 13-week period of a sort of average
week. For each song there is a weighted week for plugging
and for sales. These are tabulated as I_p and I_s in Table 2.
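
As an illustration of the index just defined, the sketch below computes the weighted week for a plugging series and a sales series and takes their difference. The two 13-week series are invented for the example and are not the data for any song in the study; the function name and the use of None for unranked weeks are likewise assumptions.

```python
def weighted_week(inverted_ranks):
    """Weighted average week, I = sum(W * r) / sum(r).

    inverted_ranks holds one entry per week, week 1 first. Use None
    for weeks without a ranking, which the text treats as r = 0.
    """
    ranked = [(week, r)
              for week, r in enumerate(inverted_ranks, start=1)
              if r is not None]
    total = sum(r for _, r in ranked)
    if total == 0:
        raise ValueError("song has no rankings in the period")
    return sum(week * r for week, r in ranked) / total


# Invented series: plugging peaks in week 4, sales in week 6.
plugging = [4, 8, 12, 15, 12, 8, 4, None, None, None, None, None, None]
sales = [None, None, 4, 8, 12, 15, 12, 8, 4, None, None, None, None]

i_p = weighted_week(plugging)
i_s = weighted_week(sales)
print(i_p, i_s, i_s - i_p)
```

Because each invented series is symmetric about its peak, the index falls on the peak week, and the difference of two weeks is the lag that the paper tabulates as D.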









































TABLE 2

                                      Index
     Song                         I_p       I_s       D
                                Plugging   Sales

a. Jeepers Creepers               4.5       5.2      0.7
b. This Can’t Be Love             3.6       4.7      1.1
c. FDR Jones                      2.4       4.2      1.8
d. Could Be                       8.1       9.7      1.6
e. [title illegible]              9.8       8.1     -1.7
f. I Have Eyes                    5.6       9.3      3.7
g. Umbrella Man                   7.6       6.7     -0.6
h. Deep Purple                   10.1      10.2      0.1
i. Penny Serenade                 9.5      10.7      1.2
j. Do I Love You                  7.6      10.9      3.3
k. Darn That Dream                8.2       9.2      1.0
l. In an Old Dutch Garden         6.5       9.7      3.2
m. Careless                       5.9       8.2      2.3
n. All The Things You Are         4.1       5.4      1.3
o. Faithful Forever               3.9       5.9      2.0
p. Starlit Hour                  10.2      11.3      1.1
q. It’s a Blue World              9.7      11.1      1.4
r. [title illegible]              3.5       8.1      4.6
s. Indian Summer                  6.5       9.0      2.5
t. Oh Johnny Oh                   3.5       5.0      1.5

(Songs a-i: 1939; j-t: 1940)


The last column in Table 2 shows the difference between
the two indices for each song. It is positive when plugging
leads sales and negative when sales lead plugging. It is clear
that this difference is generally positive, and that most of the
twenty entries are fairly close to the average value of 1.64
weeks. This may be taken to mean that the sales curve rises
after the plugging curve and that the sales curve also declines
later than the plugging curve. In other words, plugging
systematically precedes sales.

Since there is a steady stream of songs, plugging and sales,
it is clear that the whole life-cycle of plugging and sales for
all songs is not included in the thirteen-week range used in
computing these indices. For this reason those songs whose
plugging index, I_p, falls outside the range 3.5 to 9.5 were
grouped together, to see if they show a different time-lag from
those inside this range. The average lag for the three songs
with an I_p of 3.5 or less was 2.63 weeks. The average lag
for the songs in the range 3.5 to 9.5 was 1.84. The corresponding
number for the songs with I_p of 9.5 or more was 0.42 weeks.
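
The three group averages can be recomputed from the plugging index and the difference D printed in Table 2. The sketch below does this; it assumes that the D entries printed as ‘‘11’’ and ‘‘14’’ for songs p and q are 1.1 and 1.4, and it treats an index of exactly 3.5 or 9.5 as belonging to the outer groups, as the wording ‘‘3.5 or less’’ and ‘‘9.5 or more’’ suggests.

```python
# (plugging index, D) as printed in Table 2, songs a through t.
# Assumption: D for songs p and q is read as 1.1 and 1.4.
TABLE_2 = [
    (4.5, 0.7), (3.6, 1.1), (2.4, 1.8), (8.1, 1.6), (9.8, -1.7),
    (5.6, 3.7), (7.6, -0.6), (10.1, 0.1), (9.5, 1.2), (7.6, 3.3),
    (8.2, 1.0), (6.5, 3.2), (5.9, 2.3), (4.1, 1.3), (3.9, 2.0),
    (10.2, 1.1), (9.7, 1.4), (3.5, 4.6), (6.5, 2.5), (3.5, 1.5),
]


def mean(values):
    return sum(values) / len(values)


early = [d for ip, d in TABLE_2 if ip <= 3.5]
middle = [d for ip, d in TABLE_2 if 3.5 < ip < 9.5]
late = [d for ip, d in TABLE_2 if ip >= 9.5]

for label, group in (("3.5 or less", early),
                     ("3.5 to 9.5", middle),
                     ("9.5 or more", late)):
    print(f"{label}: n={len(group)}, mean lag={mean(group):.2f} weeks")
```

With these readings the groups hold 3, 12, and 5 songs and the means come to 2.63, 1.84, and 0.42 weeks, agreeing with the figures in the text.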

Those songs whose ‘‘main plugging week’’ comes before
week 3.5 are near the end of their cycle. Plugging falls off
rapidly because it is centrally controlled. Sales fall off slowly
because they involve thousands of people who are not centrally
controlled. Thus a big lag between plugging and sales is to
be expected.

The songs with plugging index 9.5 or more are clearly those 
whose cycle was started late in the 13-week observation period. 
The period of reduced plugging is therefore under-represented 
in this group, and the lag represents mainly the difference 
between plugging-on-the-increase and sales-on-the-increase. 

The songs whose main plugging week falls within three
weeks of the middle of the 13-week period (that is, in the
range 3.5 to 9.5) show an average lag of 1.84 weeks, or roughly
13 days. This represents, in the writer’s opinion, the best
estimate of the mean lag between plugging and sales. It is
quite certain (P greater than 0.99) that the sales curve lags
behind the plugging curve. It is also quite certain (P greater
than 0.95) that the mean lag will fall in the range 1.05 to 2.63
weeks. Greater precision is not possible on the basis of the
12 songs whose cycles fall entirely in the two 13-week observation
periods.
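
The paper does not say how the two probability statements were obtained. One reading that fits the printed figures is a one-sample Student's t interval on the twelve lags in the middle group of Table 2; the sketch below is offered under that assumption only. The critical values are the standard two-sided table entries for 11 degrees of freedom.

```python
import math

# Lags D (weeks) for the 12 songs of Table 2 with index between 3.5 and 9.5.
lags = [0.7, 1.1, 1.6, 3.7, -0.6, 3.3, 1.0, 3.2, 2.3, 1.3, 2.0, 2.5]

n = len(lags)
mean_lag = sum(lags) / n
variance = sum((d - mean_lag) ** 2 for d in lags) / (n - 1)  # sample variance
std_error = math.sqrt(variance / n)

T_CRIT_95 = 2.201  # two-sided 5 per cent point of Student's t, 11 d.f.
T_CRIT_99 = 3.106  # two-sided 1 per cent point of Student's t, 11 d.f.

low = mean_lag - T_CRIT_95 * std_error
high = mean_lag + T_CRIT_95 * std_error
t_stat = mean_lag / std_error  # test of the hypothesis "mean lag is zero"

print(f"mean lag {mean_lag:.2f}, 95% interval {low:.2f} to {high:.2f} weeks")
print(f"t = {t_stat:.2f}, exceeds 1% point: {t_stat > T_CRIT_99}")
```

Under this assumption the interval comes out at 1.05 to 2.63 weeks, and the t statistic lies well beyond the 1 per cent point, which is consistent with both statements in the text.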

The psychological interpretation of the strong influence of
radio plugs on sheet music sales with a time-lag of about two 
weeks is still a matter of conjecture. It may be that people 
like those songs better the more often they hear them, and that 
it is for this reason that they buy; or possibly the plugging of 
songs over the radio creates a certain social pressure which 
compels people to buy these songs in order to live up to 
the fashion requirements of their group. Although prob- 
ably both processes play a role, there is some inferential evi- 
dence that the social-pressure factor is greater than what 
might be termed the emotional effect of plugging. In the 
experiment reported by Gerhart Wiebe in this issue² it has
been shown that plugging has only a small influence on liking 
as expressed by rating on a 10-point scale. It is therefore not 
probable that the great influence of radio plugs on sales can 
be explained by the hypothesis that people like the songs so 
much more when they hear them very often over the radio. 
It could still be, of course, that the program-producers know 
what people like and plug songs accordingly, so that the rela- 
tion between plugging and sheet sales might be due to a wise 
selection of material most suitable to plugging. More infer- 
ential evidence from another study, however, would make this 
possibility rather improbable. 

If inherent merits of the song were decisive for whether it
is heavily plugged, then the musical officers of the radio industry
should be able to predict what songs will be successful
in the sense that they will be played very often over the radio.
In order to test this point, five experts connected with one of
the major networks (two were staff orchestra leaders, three
were producers of popular musical programs) were asked³ to
rate 125 unpublished or just published popular songs. Each
expert heard about 25 songs. Table 3 shows the number of
songs predicted to have Great, Average, or Little Success,
and the number in each prediction class that actually enjoyed,
in terms of plugging, Great, Average, and Little Success.

2 ‘‘The Effect of Radio Plugging on Students’ Opinions of Popular
Songs.’’

3 By Mr. Gerhart Wiebe, in a study done for the Office of Radio
Research.

The evidence of success in terms of plugging is supplied by
the lists published in the theatrical and musical weekly,
Variety. This paper lists all popular songs broadcast by the
major networks more than ten times in the preceding week,
giving the number of playings for each. Those songs that
were played more than 30 times in one week, or that appeared
in the lists twelve or more times, were classified as ‘‘Great
Successes.’’ Songs that appeared in the lists but did not attain
either the 30-per-week or the 12-week level were classified
as ‘‘Successes.’’ Those not appearing at all were labelled
‘‘No Success.’’
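
The three-way classification can be written as a short rule. In the sketch below the function name and the input layout (one play count for each week in which the song appeared in the Variety lists) are assumptions made for illustration; the thresholds are those given in the text.

```python
def plugging_outcome(weekly_plays):
    """Classify a song from its appearances in the weekly lists.

    weekly_plays holds the number of playings for each week in which
    the song was listed; an empty list means it never appeared.
    """
    if not weekly_plays:
        return "No Success"
    if max(weekly_plays) > 30 or len(weekly_plays) >= 12:
        return "Great Success"
    return "Success"


print(plugging_outcome([12, 31]))   # one week above 30 playings
print(plugging_outcome([11] * 12))  # listed in twelve weeks
print(plugging_outcome([15, 20]))   # listed, but neither level reached
print(plugging_outcome([]))         # never listed
```

The four calls return Great Success, Great Success, Success, and No Success respectively.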

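TABLE 3
Experts’ Ratings of Success and Actual Outcomes

                          Predicted Success
Outcome             Great    Average    Little    Total

Great Success        [11]       20        13        44
Success               [5]       14         8        27
No Success            [6]       23        25        54

Total                [22]       57        46       125

[The row labels and the first column are illegible in this copy.
The labels follow the classification given in the text; the
bracketed figures are the totals less the two surviving cells.]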

This table taken as a whole shows that these experts could
not predict the success in terms of plugging of these songs.⁴
Inspection of the table suggests that those songs predicted to
have Great Success do actually have better than average success,
and that those predicted to have less than average success
seem actually to average lower. A statistical test,⁵ however,
shows that this distribution does not differ significantly from
a random one. There is therefore in these data no statistically
significant evidence whatever that persons directly connected
with the production of popular music programs are able to
predict the success of popular songs. This makes it probable
that it is not the inherent merit of the song which decides
whether it is being plugged or not. A general survey of the
popular music industry made by the Office of Radio Research
indicates that it is rather the promotional and financial power
of the publishing houses to which the radio success of most
hits can be traced.⁶

4 A Chi-square calculation shows that the numbers in the table do not
differ significantly from a set of numbers picked at random to give the
same totals; that is, the distribution of numbers in the nine cells
provides no evidence of significant correlation between outcome and
prediction.

5 A Chi-square calculation as before.
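
The Chi-square calculation mentioned in the footnotes can be sketched from the figures that survive in Table 3. The column of songs predicted to have Great Success is not legible in this copy, so the sketch derives it as each row total less the two legible cells; that derivation, the reading of the legible columns as Average, Little, and Total, and the order of the outcome rows are all assumptions. The critical value is the standard 5 per cent point for 4 degrees of freedom.

```python
# Rows: actual outcome (Great Success, Success, No Success), assumed order.
# Legible columns, assumed to be: predicted Average, predicted Little, total.
printed = [(20, 13, 44), (14, 8, 27), (23, 25, 54)]

# Assumption: predicted Great = row total minus the two legible cells.
observed = [[total - avg - little, avg, little]
            for avg, little, total in printed]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i in range(3):
    for j in range(3):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (observed[i][j] - expected) ** 2 / expected

CRITICAL_5_PERCENT = 9.488  # Chi-square, (3 - 1) * (3 - 1) = 4 d.f.

print(observed)
print(f"chi-square = {chi_square:.2f}, "
      f"significant at 5%: {chi_square > CRITICAL_5_PERCENT}")
```

On these assumptions the statistic is about 5.3, below the 5 per cent point, which agrees with the paper's statement that the distribution does not differ significantly from a random one.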

Taking all the factors discussed in this paper into consider- 
ation it is probable that the effect which radio has upon popu- 
lar consumption in terms of purchases of sheet music is a 
phenomenon of social pressure rather than a selective judg- 
ment by the radio audience. 

6 The present study deals only with the leading hits in regard to both
radio performances and sheet sales. It is to be expected that for less
successful but still leading hits the time lag between radio and sales is
somewhat longer. Finally, there is the remaining third of all songs, which
are successful only on the radio or only in their sheet sales. These songs
will permit the study of exceptions, where the regular pressure mechanism
does not work. Further studies along these lines are under way.














II. Educational and Other Effects of Radio 
READING, WRITING, AND RADIO 
A Study of Five School Broadcasts in Literature 


SEERLEY REID 


Research Associate, Evaluation of School Broadcasts, 
Ohio State University, Columbus, Ohio 


During the year 1938-1939 the Chicago Public Schools,
through the Chicago Radio Council, produced and
broadcast to schools each week nine fifteen-minute radio
programs. One of these series of programs was entitled ‘‘Let’s
Tell a Story’’ and consisted of weekly dramatizations of books
specifically recommended by librarians for seventh- and
eighth-grade pupils.

An evaluation of this series—to determine its effectiveness 
in achieving certain educational objectives—was carried on 
cooperatively by Chicago teachers, members of the Chicago 
Radio Council, and members of the staff of the Evaluation of 
School Broadcasts, Ohio State University. The experiment 
was designed to answer the following three questions: 

1. To what extent did this series of programs, plus the teach- 
ers’ utilization of them in the classroom, stimulate the reading 
interests of seventh- and eighth-grade boys and girls? To what 
extent did the classroom listening experience increase the 
number of these interests? To what extent did it increase the 
number of students’ interests in different kinds of stories? 

2. To what extent did this series of programs, plus the teach- 
ers’ utilization of them in the classroom, stimulate the amount 
of reading done by seventh- and eighth-grade boys and girls? 
To what extent did it increase the number of books read? To 
what extent did it increase students’ reading of the specific 
books that were dramatized in the radio broadcasts? 

3. To what extent did this series of programs, plus the teach-
ers’ utilization of them in the classroom, stimulate students to 
write more interesting and effective English? 

In addition to these three questions, there were several others 
that grew out of the investigation, namely:

1. Were there differences between the number of reading in- 
terests expressed by boys and girls? 

2. What kinds of stories were preferred by boys and girls? 

3. Were there differences between the number of books read 
by boys and girls? 

4. Were there differences between the writing abilities of 
boys and girls? 

The period of the investigation was that of December 1, 
1938, to January 16, 1939. During this period five books were 
dramatized and broadcast:

Boys’ Life of Colonel Lawrence     Thomas      December 1
Story of a Bad Boy                 Aldrich     December 8
Little Women                       Alcott      December 15
Ol’ Paul                           Rounds      January 5
Waterless Mountain                 Armer       January 12


Changes in reading interests were measured by a reading 
interest questionnaire answered by students before and after 
the experimental period; amounts of reading by a reading 
record kept by students during the period; and changes in 
writing ability by compositions written by students before and 
after the experimental period. The twelve experimental or 
radio classes were matched roughly with twelve control or non- 
radio classes on the bases of grade, intelligence, and socio- 
economic background. 


READING INTERESTS 

The reading interest questionnaire consisted of seventy 
items, i.e., seventy kinds of stories, and students were asked to 
check those kinds of stories that they liked to read. It was
assumed that their responses to the items on this questionnaire
constituted a valid index of their reading interests, that
changes in the number of reading interests could be determined
by comparing the total number of ‘‘likes’’ expressed on
the questionnaire, given once before and once after the experi- 
mental period. Moreover, the questionnaire was constructed 
so that students’ interest in reading specific kinds of stories 
could be determined as well as the total number of their read- 
ing interests. These kinds of stories were included: 


Animals            Mystery
Boys               War
Girls              Love
Mythology          Adventure
Pioneers           Sea
Sports             Famous People
Other Countries    Fantasy


Here again it was assumed that changes in students’ interest 
in reading these kinds of stories could be determined by com- 
paring the number of ‘‘likes’’ expressed on the questionnaire, 
given once before and once after the experimental period.

First, a comparison of the expressed reading interests of boys
and girls shows few similarities and many differences. The 
boys expressed more reading interests (‘‘likes’’) than the 
girls, the averages being 36.88 and 31.87, respectively. This 
difference between boys and girls was somewhat unexpected 
since the assumption has commonly been made that girls have 
more reading interests than boys. To generalize from these 
data, however, that the boys were more interested in reading 
than were the girls would be going beyond the facts. Interest 
in reading is not necessarily the same as number of reading 
interests, and the girls may have had fewer, but more intense, 
reading interests. They may have been interested in fewer 
kinds of stories, but so intensely interested that they read 
more than the boys. They may have been more interested in 
reading as measured by the amount of their reading at the 
same time that they had fewer reading interests. Neverthe- 
less, the data indicate clearly that the boys had more professed 
reading interests than did the girls. This conclusion is based, 
of course, upon the assumption that the reading interest questionnaire
which was used included all of the more important
reading interests of seventh- and eighth-grade students, plus
the usual assumptions involved in using the questionnaire 
technique. 

Second, not only did the boys and girls differ in the number 
of their reading interests but also in the types of stories that 
they preferred. In Table 1 fourteen kinds of stories—ani- 
mals, boys, girls, mythology, fantasy, pioneers, sports, other 
countries, mystery, war, love, adventure, sea, famous people— 
are ranked in the order of the mean number of ‘‘likes’’ expressed
for stories in each category. A study of this table
shows that the reading interests of these boys and girls were 
decidedly different in the first place. 


TABLE 1

Reading Preferences of 987 Chicago Boys and Girls as Indicated by the
Average Number of Their Interests in Different Kinds of Stories

              Boys                              Girls
                     Mean number                       Mean number
Stories of           of reading    Stories of          of reading
                     interests                         interests

[illegible]             3.91       Girls                  3.68
War                     3.89       Mystery                3.63
[illegible]             3.87       [illegible]            2.70
Adventure               3.65       Love                   2.66
Mythology               3.19       [illegible]            2.33
Sea                     3.17       Famous People          2.21
[illegible]             2.94       [illegible]            2.20
[illegible]             2.93       Adventure              2.14
[illegible]             2.59       Mythology              1.95
[illegible]             2.48       Boys                   1.84
Famous People           1.96       Sea                    1.73
Other Countries         1.57       Other Countries        1.65
Love                     .54       War                    1.62
Girls                    .33       [illegible]            1.55








The average number of interests in the different categories
is entirely different, with two exceptions. Both the boys and
girls expressed a strong interest in mystery stories, the mean
number of interests of the boys being 3.67, that of the girls
3.63; both the boys and girls expressed little interest in stories
of other countries, the mean number of interests of the boys
being 1.57, that of the girls 1.65. Mystery stories was the only
category of reading interests in which both the boys and girls
indicated strong interest; stories of other countries the only one in
which they both expressed comparatively little interest.
Otherwise, there was little agreement even upon those stories 
that they did not like so well. Boys indicated a strong interest 
in stories of pioneers and of war; the girls showed little inter- 
est in both, particularly in war stories. The girls showed in- 
terest in reading stories about girls and the boys were defi- 
nitely not interested in such stories. Other differences can 
easily be seen from the table: the boys showed more interest 
than the girls in stories of adventure, mythology, the sea, 
sports, animals, and boys, but less interest in stories of love, 
famous people, and other countries. 

Third, the students who listened to the five radio dramatiza- 
tions of stories did not change significantly in the total num- 
ber of their expressed reading interests or in the number of 
their interests in various kinds of stories as compared with 
those pupils who did not hear the programs. There was no 
statistically significant difference between the mean gains in 
the number of reading interests of the radio and the control 
classes. Likewise, there was no statistically significant differ- 
ence between the mean gains of the radio and control classes 
for any of the four sub-groups: boys, girls, seventh-grade stu- 
dents, eighth-grade students. Actually, there was little change 
in either group. The average of the mean gains of the radio
classes was -.17, that of the control classes only .21. Moreover,
the mean changes in both radio and control classes were
negative as well as positive. In six radio classes and in six
control classes, the average number of reading interests expressed
by the students was less at the end of the experimental
period than at the beginning. Although the average of the
mean gains of these classes was close to zero, there were large
variations in individual classes. As a matter of fact, the
largest mean gain was made by a control class and the greatest
loss in the number of reading interests was made by a radio
class. It seems quite evident that this series of literature programs
did not increase the number of reading interests of the
boys and girls who listened to them.

1 In this comparison, and in others in this experiment, R. A. Fisher’s
t test was used to determine the statistical significance of differences in
means, and his five per cent level of significance was accepted.


AMOUNT OF READING 


The amount of reading done by students in the radio and 
control groups was determined from a reading record kept by 
each student of the books he read during the experimental 
period. It was assumed that the amount of reading done by 
students could be estimated by tabulating the number of books 
that they reported reading during the experimental period. 
It was assumed, too, that the number of students who read the 
specific books that were broadcast could be determined from 
these reading records. 

First, the girls in the radio and control classes read more 
books than the boys even though they expressed fewer reading 
interests than the boys on the reading interest questionnaire. 
The average number of books read by the girls was 8.57, that 
of the boys 6.86. The fact that the boys expressed more read- 
ing interests than the girls but read fewer books is another 
indication that reading interests are not necessarily measured 
by the number of books read. It seems quite apparent that 
the number of books one reads is definitely limited by the 
availability, accessibility, and readability of books. One may 
read on a subject in which he has relatively little interest 
merely because certain books and magazines are readily acces- 
sible, easily read, drawn to his attention by skillful advertis- 
ing, or required by the teacher. One may fail to read books 
and magazines dealing with subjects in which he is intensely 
interested simply because the materials are not available, 
accessible, or readable. 

Second, the students who listened to the five radio programs
read more books than those students who did not hear the pro- 
grams. There was a statistically significant difference be- 
tween the average number of books read by students in the 
radio classes and by those in the control classes, although indi- 
vidual classes varied widely. The average of the means of the 
radio classes was 9.32, that of the control classes 6.32. Like- 
wise, there was a statistically significant difference between 
the average number of books read by the boys in the radio and 
control classes, by the girls in the radio and control classes, 
and by the seventh-grade students in the radio and control 
classes. There was not a statistically significant difference 
between the average number of books read by the eighth-grade 
students in the radio and control classes. 

Third, these students who heard the radio dramatizations of 
books read some of those books, but not others. Little Women 
was reported on their reading records by 116 students in these 
radio classes, Story of a Bad Boy by 63 students. On the 
other hand, Waterless Mountain was read by only ten of the 
students, Boys’ Life of Colonel Lawrence by only two, and Ol’ 
Paul by no one. 

There seem to be several possible explanations for the fact 
that the students read some of the books which were broadcast
but not others. One is that they read those books whose titles 
were already familiar to them and that this series of radio 
programs was ineffective in stimulating them to read books 
that were relatively unfamiliar to them. Another is that the 
teachers placed much greater stress upon the reading of Little 
Women and the Story of a Bad Boy than they did upon the 
reading of the other stories. A third possible explanation is 
that there were only a few copies of Waterless Mountain, Boys’ 
Life of Colonel Lawrence and Ol’ Paul available in contrast to 
the many copies of the two more familiar books that are ordin- 
arily available in all school and public libraries. Although 
the data indicate that this series of programs was ineffective 
in stimulating students to read books that were unfamiliar to
them, further research in which the factors of availability,
accessibility, readability, and teacher stimulation are con- 
trolled is necessary to determine the effectiveness of radio 
dramatizations in encouraging children to read specific books. 


WRITING ABILITY 


Evidence of changes in pupils’ ability to write correct and 
effective English was obtained by a comparison of two of their 
compositions, one written on ‘‘Life in 1988’’ following the
broadcast on November 10, 1938, of Twenty Thousand Leagues
Under the Sea, the other written on ‘‘Exploring in 3939’’ fol-
lowing the dramatized broadcast of Waterless Mountain on
January 12, 1939. These compositions were rated by a compe-
tent jury on a seven-point scale (the best compositions were
designated as ‘‘1’’; the poorest ones as ‘‘7’’) using the two
general criteria of content and form. The following items
were considered in judging content: number of ideas, develop-
ment of these ideas, specificity, consistency, and originality;
and in judging form these two: coherence and style.

First, the girls in these radio and control classes demon-
strated on one composition, ‘‘Life in 1988,’’ greater writing
ability than the boys. The mean score of the girls was 4.54,
that of the boys 4.79 (on this scale a lower score indicates a
better composition).

Second, there was a statistically significant difference be- 
tween the average gains made in composition ability by the 
students in the radio and by those in the control classes. Like- 
wise, there was a statistically significant difference between 
the average gains made by the boys in the radio and control 
classes. But there was no statistically significant difference 
between the average gains made in composition ability by the 
girls in the radio and control classes, by the seventh-grade 
students in the radio and control classes, or by the eighth- 
grade students in the radio and control classes. The average 
of the mean gains of the radio classes was .35, that of the 
control classes -.43. Six radio classes showed a gain in means,
six a loss, but only two control classes showed a gain in means.
Actually, of course, the radio classes did not gain to any extent
but rather they remained fairly constant while the control 
classes were regressing. 


CONCLUSIONS AND RECOMMENDATIONS 


The following conclusions and recommendations seem war- 
ranted from a study of the data in this experiment: 

First, this series of radio programs—plus the teachers’
utilization of them—failed to increase the number of expressed 
reading interests of these seventh- and eighth-grade boys and 
girls, both in the total number of interests and in the interests 
expressed in different kinds of stories. It is, perhaps, overly 
optimistic to expect that fifteen-minute broadcasts once a week 
for six weeks can change to any extent the reading interests 
of children, interests which have been formed by children’s 
previous experiences in school, by their home and community 
relationships, by the pressures of groups of their own age. 
Perhaps the experimental period of six weeks was too short a 
time in which to expect changes. At any rate, there was no 
change in the number of their reading interests as a result of
their listening to these programs. Further research is neces- 
sary in order to determine the effectiveness of radio programs 
in increasing the number of children’s reading interests. 

Second, this series of radio programs—plus the teachers’
utilization of them—did stimulate these seventh- and eighth- 
grade boys and girls to read more books. The fact that the 
students who listened to the programs reported reading more 
books than those who did not hear the broadcasts—even 
though they did not change significantly in the number of 
their reading interests—indicates that radio programs can be 
used as a stimulant to reading. Obviously, this stimulant may 
have been in the broadcasts themselves or in the teachers’ 
classroom techniques before and after the broadcasts. In 
either case, however, it is evident that radio dramatizations 
can be used effectively by teachers who want their students to
read more books.

Third, this series of radio programs—plus the teachers’
utilization of them—failed to stimulate these students’ read- 
ing of some of the particular books dramatized and broadcast. 
Only two of the five books were read to any extent by those 
students who heard the programs, and those two books were 
both well known and ‘‘accepted’’ as children’s favorites. The 
other three books—none of them familiar children’s books— 
simply were not read either before, during, or after the radio 
dramatizations. Obviously, one reason for this failure of the 
students to read some of the books that were broadcast may 
have been that the books were not available or accessible to 
many pupils. Another reason may have been that the teachers 
themselves focussed attention upon reading those books that 
are commonly ‘‘accepted’’ as children’s favorites. A third 
reason may be that fifteen-minute weekly radio dramatiza- 
tions are ineffective in encouraging students to read books that 
are unfamiliar to them. Further research, in which the fac- 
tors of availability, accessibility, and readability are controlled 
as well as the more common stimulants of teachers, parents, 
and movies, is necessary in order to determine the effective- 
ness of radio programs in encouraging children to read spe- 
cific books. 

Fourth, this series of radio programs—plus the teachers’ 
utilization of them—did stimulate these students to write more 
effective English. Here, however, a reservation must be made 
as to the scope of this conclusion: it is valid when all students 
in the radio classes in both grades are compared with all stu- 
dents in the control classes in both grades; it is valid when all 
the boys in the radio classes are compared with all the boys in
the control classes. It is not valid in the comparison of the 
girls in the radio and control classes or of the seventh-grade 
students in the radio and control classes or of the eighth-grade 
students in the radio and control classes. The evidence, then, 
is inconclusive and leads to a tentative conclusion that effective 
utilization of radio programs by the teachers did stimulate 
some students to write more interestingly and more effectively.

Fifth, there were differences between boys and girls in the
number of reading interests, in the number of books read, and 
in writing ability. The boys expressed more interests in read- 
ing but read fewer books than the girls and were inferior to 
the girls in writing ability. The fact that the boys expressed 
more reading interests but read fewer books suggests that 
teachers and librarians should study these interests and pro- 
vide sufficient readable books of various types to satisfy these 
interests of boys. The fact that the girls expressed fewer 
interests than the boys but read more books suggests that 
teachers and librarians should question the common assump- 
tion that girls have more reading interests and turn their 
attention to increasing and broadening the reading interests 
of girls. 

Sixth, there were decided differences between the story pref- 
erences of these boys and girls. The boys expressed the larg- 
est number of reading interests in stories of pioneers, war, 
mystery, and adventure; the girls in stories about girls and in 
mystery stories. These preferences suggest that if the pur- 
pose of the broadcaster is that of giving boys and girls of the 
seventh and eighth grades an interesting and enjoyable radio 
dramatization, he should choose stories of these types. More-
over, the fact that the students in both grades expressed com-
paratively little interest in stories of famous people and in
stories of other countries suggests that the broadcaster should 
avoid stories of these types, both of which are commonly used 
in school broadcasts—again if his purpose is that of giving
pupils radio dramatizations of stories which they now enjoy. 














RADIO AND ELEMENTARY SCIENCE 
TEACHING 


J. ROBERT MILES 


Research Associate, Evaluation of School Broadcasts, 
Ohio State University, Columbus, Ohio 


THE increasing use of radio programs to supplement
classroom teaching has indicated a need for evidence
of their effectiveness. The Evaluation of School
Broadcasts¹ research project at Ohio State University has,
for three years, been cooperating with local and network 
broadcasters in appraising the value of such programs. One 
opportunity to evaluate a series of school broadcasts arose in 
September, 1939, when the Radio Council of the Chicago 
Public Schools invited members of the ESB Staff to assist 
in determining the effectiveness of their elementary science 
series ‘‘Your Science Story Teller.’’


SETTING UP THE EXPERIMENT 


The content of this series was determined by a committee 
of 5th- and 6th-grade science teachers who were familiar 
with the newly-revised course of study which these programs 
were to supplement. The objectives of both this course of 
study and the radio programs were: 

1. To increase student knowledge about problems in the
conservation of wildlife and natural resources.

2. To develop student attitudes favorable to the conserva-
tion of wildlife and natural resources.

3. To extend student interest in the conservation of wild-
life and natural resources.

These science teachers, in a conference with representatives
of the Radio Council and the ESB Staff, agreed to conduct
a controlled experiment based on these objectives comparing
the semester progress of ten 5th- and 6th-grade science classes
hearing the program with that of ten matched classes not
hearing it.

1 Hereafter designated as ESB.

Ten pairs of classes—five from the 5th grade and five from 
the 6th—were selected with each pair approximately alike in 
intelligence, reading ability, age and sex. These classes were 
located in schools representing all sections of the city as well 
as different nationalities and economic strata. 


CONSTRUCTING THE TEST 


The comparison of the progress of the radio and non-radio 
classes was based on a testing instrument constructed by 
members of the Radio Council and ESB Staff in collabora- 
tion with a committee of the cooperating teachers. In the 
first section of this test, the information and attitudes of 
students were indicated by their reactions to five conserva- 
tion problems. Students reacted to each problem by agree- 
ing, disagreeing or indicating uncertainty about fifteen state- 
ments of fact or opinion. In the second section of the test, 
students indicated their interest in various aspects of con- 
servation by responding to three groups of ten statements 
each. They were asked to check any statements in the first 
group about which they would ‘‘like to know,’’ any state- 
ments in the second group about which they would ‘‘like to 
find out for themselves,’’ and any statements in the third 
group about which they would ‘‘like to report to their class.’’ 

After revisions suggested by teachers and ESB Staff mem- 
bers, the test was administered to the twenty classes on Octo- 
ber 26, 1939. It was again administered to these classes on 
January 25, 1940, after the ten radio classes had utilized the 
thirteen programs in the series. On the later date, teachers 
of the radio classes submitted reports indicating their prepa- 
ration and follow-up activities, the amount of time spent in 
such utilization practices, and the student activities resulting 
from the broadcasts. All teachers reported the sex, grade 
and IQ of their students.


SCORING THE TESTS


The 75 student responses on the first section of the test, 
of which 38 were on information and 37 on attitudes, were 
scored by machine, according to a key developed by the com- 
mittee that constructed the test. A total score for each 
variable was obtained for each student. 

The responses to the thirty interest items were also scored 
by machine but the three levels of interests ‘‘to know,’’ ‘‘to 
find out for myself,’’ and ‘‘to report to my class’’ were 
weighted 1, 2, and 3 respectively before adding to get a total 
score. Thus the maximum ‘‘interest’’ score for the 30 items
was 60 (10×1 + 10×2 + 10×3). This method of scoring the
section on interests resulted from previous exploratory re- 
search at Cicero, Illinois, in which the two meanings of in- 
terest, namely, ‘‘depth’’ of interest and ‘‘breadth’’ of inter- 
est, were shown to be essential parts of any index of interest. 
That is, both the extent of a student’s responses and the pro- 
portion of responses at the three levels should be included in 
his interest score. Thus the student who indicated interest 
in finding out for himself about the ten items in the second 
group was assumed to be more ‘‘deeply’’ interested than the 
student who indicated interest in knowing about the ten 
items in the first group. Total scores were weighted to show 
this difference. 
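
The weighting just described can be retraced in a few lines of Python. This is only a sketch: the function name, the level names, and the dictionary layout are assumptions made for illustration and do not come from the original scoring key. Each checked statement counts 1, 2, or 3 points according to its group, so that checking all thirty statements gives the maximum of 60.

```python
# Weights for the three levels of interest, as stated in the text:
# "to know" = 1, "to find out for myself" = 2, "to report to my class" = 3.
# The short level names are hypothetical labels chosen for this sketch.
WEIGHTS = {"know": 1, "find_out": 2, "report": 3}
ITEMS_PER_GROUP = 10


def interest_score(checked):
    """Return the weighted interest score for one student.

    `checked` maps a level name to the number of statements (0-10)
    the student checked in that group.
    """
    total = 0
    for level, weight in WEIGHTS.items():
        count = checked.get(level, 0)
        if not 0 <= count <= ITEMS_PER_GROUP:
            raise ValueError(f"{level}: expected 0-{ITEMS_PER_GROUP} checks")
        total += weight * count
    return total


# A student who checks every statement reaches the maximum score.
print(interest_score({"know": 10, "find_out": 10, "report": 10}))  # 60
# Ten checks at the second level outweigh ten checks at the first.
print(interest_score({"find_out": 10}), interest_score({"know": 10}))  # 20 10
```

Under this scheme two students who check the same number of statements can receive different scores, which is how the index reflects ‘‘depth’’ as well as ‘‘breadth’’ of interest.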


VALIDITY AND RELIABILITY 


The validity of the test can only be judged in terms of the 
qualifications and procedures of the committee that con- 
structed it. The Chicago teachers who cooperated in the 
construction of the test had previously participated in the 
reorganization of the 5th- and 6th-grade course of study in 
science. They had also cooperated with the representatives 
of the Radio Council in outlining the objectives and the con- 
tent of the supplementary radio series. As participants on 
the committee, these teachers were qualified to judge whether 
the test measured the factual content and the student behavior
with which both the course of study and the radio series
were concerned. The incorporation of the suggestions of the 
various committee members is the only evidence of the validity 
of the test. 

The reliability of the student responses was checked on 
each variable (information, attitudes, interest) of the test 
by the split-half method employing the Spearman-Brown 
correction formula. Previous to this, frequency distribu- 
tions of the responses of boys and girls in the 5th and 6th 
grades were made. These showed a close approximation to 
a ‘‘normal’’ distribution (in all cases) on each of the vari-
ables. The reliability coefficients computed from the pre-test
responses of a sample of 70 students (one fifth- and one
sixth-grade class) were .67 on information, .83 on attitudes,
and .92 on interests. Such reliability was considered satisfactory
for the group interpretations which were made.
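
The split-half procedure with the Spearman-Brown correction can be sketched as follows. The source does not say how the two halves were formed; the odd-even split, the function names, and the small response matrix are assumptions made only for illustration. The correction itself is the standard formula r_full = 2r / (1 + r), where r is the correlation between the two half-test scores.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Reliability from an odd-even split (the split is an assumption).

    `item_scores` holds one row per student and one column per item.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))


# Hypothetical responses of four students to four items (1 = keyed response).
responses = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
print(split_half_reliability(responses))
```

Read backwards, the formula shows what the reported coefficients imply: a corrected reliability of .92 corresponds to a correlation of about .85 between the two halves, since r = r_full / (2 - r_full).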


INTERPRETING THE TEST SCORES¹


After tabulating the pre- and post-test scores for the 651
students and noting the ‘‘normal’’ distribution and the wide
range of the responses, an analysis of variance² between and
within the radio and non-radio classes was made. In so doing,
the pre- to post-test mean gains or losses of the radio classes
were checked as to statistical significance. It should be men-
tioned that in this report only those gains or differences in
gains which would have happened by chance five times or fewer
in 100 are reported as ‘‘significant.’’ All other gains or
differences in gains are reported as insufficient evidence for
denying the hypothesis that ‘‘no gain or difference in gains
occurred.’’

1 Summary tables and graphs of the original report on this study
(ESB Bulletin 14) are available at the offices of Evaluation of School
Broadcasts, Ohio State University, Columbus, Ohio.

2 The statistical technique known as the ‘‘Analysis of Variance’’ com-
pared the variation within the radio and non-radio classes with the dif-
ferences between the two groups. The technique employed in this experi-
ment was suggested by a method of Neyman and Johnson. For further
reference see Lindquist, ‘‘Statistical Analysis in Educational Research,’’
Chapter V.


DIFFERENCES IN PRE- AND POST-TEST RADIO CLASS MEANS 


Mean gains were registered in all classes, both radio and 
non-radio, in information and in attitudes. However, both 
gains and losses in ‘‘interest’’ were registered, most noticeably 
in the radio classes. The t-test³ of the pre- to post-test mean
gains for the ten radio classes showed a significant increase in 
both the information and the attitudes keyed as desirable. 
However, the mean gain in interests in these classes was not 
found to be significant. The analysis of variance showed simi- 
larly that the radio group as a whole gained significantly more 
than the non-radio group in both information and desirable 
attitudes but not in interests. 
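
A t-test of pre- to post-test class means of the kind referred to here can be sketched as below. The report gives no class-level figures, so the ten pairs of means are invented for illustration, and treating the test as a two-tailed test on paired class means is likewise an assumption; only the 5 per cent criterion is taken from the text.

```python
import math


def paired_t(pre_means, post_means):
    """Return (t, degrees_of_freedom) for paired class means."""
    diffs = [post - pre for pre, post in zip(pre_means, post_means)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical pre- and post-test means for ten radio classes
# (illustrative numbers only, not data from the study).
pre = [20.1, 18.4, 22.0, 19.5, 21.3, 17.8, 20.6, 19.0, 18.7, 21.9]
post = [22.4, 19.0, 23.5, 21.1, 21.0, 19.9, 22.2, 20.3, 19.5, 23.0]

t, df = paired_t(pre, post)
# Two-tailed critical value of t at the 5 per cent level for 9 degrees
# of freedom, taken from a standard t table.
T_CRITICAL_05_DF9 = 2.262
print(f"t = {t:.2f}, df = {df}, significant: {abs(t) > T_CRITICAL_05_DF9}")
```

A gain is called significant in the sense used in this report when the computed t exceeds the tabled value, that is, when a gain of that size would arise by chance no more than five times in 100.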

In the radio classes, the pre- to post-test gains were found to 
be significant only for the 6th grade while the interest gains 
were not statistically significant in either grade. In these 6th-
grade radio classes, boys made significant gains only in atti-
tudes, while girls made such gains in both information and
attitudes. When all 5th- and 6th-grade boys in the radio 
classes were considered, a significant gain in information was 
indicated. The girls in the radio classes of the two grades also 
showed a significant gain in information and in attitudes. 


RADIO VERSUS NON-RADIO 


The differences in mean gains of the radio and non-radio
classes were examined by grade level and by sex, with students
matched for five levels of IQ and for three levels of pre-test
scores. The analysis showed that significant differences in
pre- to post-test mean gains occurred in the 5th grade only.
In the 6th grade, the non-radio group made gains sufficiently
similar to the radio group to cause all differences to be in-
significant statistically—although it must be remembered that
the pre- to post-test gains made in the 6th-grade radio classes
were significant.

3 This is a technique for determining whether a significant difference
exists between the pre-score means and post-score means of the ten classes.
For further reference see Lindquist, ‘‘Statistical Analysis in Educational
Research,’’ Chapter IV.

In the 5th grade the radio classes gained significantly more 
in information and attitudes than the non-radio group but 
gained significantly less in interests. The attitude shifts were 
significantly greater for girls than for boys in these 5th-grade 
radio classes. Another significant sex difference was found in 
the 5th-grade control group where the boys made a greater 
gain in interests than the girls. 


UTILIZATION 


At the end of the broadcast series, the ten teachers of radio 
classes reported on various aspects of their utilization prac- 
tices. Considering their several reports and the test results, 
it appeared that student interest in conservation increased in 
proportion to the amount and kind of utilization that occurred. 
Five of the radio teachers reported that their entire utilization 
(both preparation and follow-up) involved less than thirty 
minutes of pupil time. The other five radio teachers reported 
that their utilization involved from forty-five minutes to sixty 
minutes (or more) of pupil time. Most of the second group 
reported both more numerous and more extensive follow-up 
activities, the preparation periods of the two groups being 
much the same. A statistical comparison of the mean scores 
of the classes of these two groups indicated that there was no 
significant difference in class gains in information or attitudes, 
but that the difference in gains on the interest scale signifi- 
cantly favored the longer utilization period. In fact, the five 
classes of the first group each registered losses in interest 
whereas all of the second group registered gains. A compari- 
son of the mean interest score of the classes in the second group 
with the mean interest scores of control classes in the same 
schools showed a significant difference in interest gains favor- 
ing the radio classes. This was not true of the other radio 
classes.

While the equal competency of the ten teachers of the radio
classes is only an assumption, they had all discussed utilization 
during a planning conference with ESB staff members and 
should have been equally well-informed about methods of utili- 
zation. Inasmuch as the radio group as a whole showed no 
greater gain in interests than the non-radio group, it seems 
essential that the above significant relation of time-of-utiliza-
tion to student interest be recognized in planning school broad-
casts.⁴


CONCLUSIONS 


In summary, the data indicate the following conclusions 
about the effects of the ‘‘Science Story Teller’’ programs on 
students in the 5th and 6th grades of Chicago schools. 

1. The mean scores of the radio classes showed that a sig- 
nificant increase in information and a significant shift in atti- 
tudes occurred during the semester. 

2. In both information and attitudes toward conservation of 
wildlife and natural resources, student progress was signifi- 
cantly greater in the radio than in the non-radio classes. This 
must be qualified by the fact that such differences in progress 
occurred primarily in the 5th grade. 

3. The mean gain in interest in the radio classes indicated 
that the amount of time devoted to utilization was an impor- 
tant factor in the development of student interest in ‘‘conser- 
vation’’ in those classes. 

4. While no significant difference in the increase of interest 
was found between the radio and non-radio groups as a whole, 
the 5th-grade non-radio classes showed a significantly greater 
increase than the radio classes. Here, as in conclusion 3 above, 
it should be noticed that the radio classes which spent the most 
time in utilizing the broadcasts were the only radio classes 
showing a gain in interests. 

4 It should be recalled that there was no ‘‘time’’ differential between
the radio and non-radio ‘‘class hours’’ of science. No estimate of pos-
sible differences in out-of-school activities was available.





THE EFFECT OF RADIO PLUGGING 
ON STUDENTS’ OPINIONS OF 
POPULAR SONGS* 


GERHART WIEBE 


THE question investigated here is: Does extensive broad-
casting (called plugging) of a song influence students’
opinions about the song? 

Twenty-four songs were chosen at random from the inex- 
haustible supply of advance copies which publishers distribute 
to the broadcasting networks. Six songs were used with each 
of the four cooperating groups of subjects. 

The subjects were members of two high school and two 
college classes. The high school students were slightly above 
average high school IQ, of ages 15 to 17, from fairly repre- 
sentative urban homes. The college students were from two 
classes, one in Psychology and one in Sociology. In all, there 
were 136 subjects. 

The four groups of students were assembled and each heard 
six new popular songs, each played twice. A few subjects had 
heard a few of the songs previously. All subjects were asked to rate
each of the songs on a ten-point scale and also to answer seven 
questions concerning the song. Each group was reconvened 
about four weeks after its first sitting and again about four 
weeks later for a third sitting. Since it was known from an- 
other study that style of performance, and especially of vocal 
performance, strongly influences attitudes towards popular 
songs, a good consistent commercial pianist was used at all 
sittings and only choruses were played. 

* This study was done under the auspices of the Office of Radio Re-
search. The writer is indebted to Mr. Cuthbert Daniel of that office for
editorial and statistical assistance.


TABLE 1
Mean Ratings of 24 Popular Songs, Coded for Plugging and for Direction
of Means from Median Score of the 24 Songs

                                                            Mean Rating
                                               First        Second       Third        Plugged (P) or   Better (B) or
No.  Song                                      sitting      sitting      sitting      Un-Plugged (U)   Less liked (L)

Group 1 ([illegible])*
  1  Two Sleepy People                         5.9          7.3          7.5          [illegible]      [illegible]
  2  Who Blew Out the [illegible]              5.1          6.6          5.9          [illegible]      [illegible]
  3  Song from Old Hawaii                      2.6          2.5          2.6          P                [illegible]
  4  [illegible]                               4.7          5.2          4.8          [illegible]      [illegible]
  5  Heart and Soul                            8.2          8.4          7.7          [illegible]      [illegible]
  6  Could You Pass in Love                    7.2          6.3          5.1          [illegible]      [illegible]

Group 2 ([illegible])*
  7  My Heart is Unemployed                    6.6          [illegible]  7.0          [illegible]      [illegible]
  8  London Bridge is Falling Down             5.2          [illegible]  4.2          [illegible]      [illegible]
  9  You Can’t Be Mine                         7.1          [illegible]  7.9          [illegible]      [illegible]
 10  I’ve Got a Heart Full of Rhythm           7.0          [illegible]  5.2          [illegible]      [illegible]
 11  [illegible]                               6.1          [illegible]  6.5          [illegible]      [illegible]
 12  I Kissed You in a Dream                   7.6          [illegible]  7.2          [illegible]      [illegible]

Group 3 (47)*
 13  How Long Can Love Keep Laughing?          6.7          6.3          6.4          [illegible]      [illegible]
 14  Ghandi Dancer                             8.4          7.6          7.0          [illegible]      [illegible]
 15  You Never Know                            7.7          7.3          6.8          [illegible]      [illegible]
 16  After Looking at You                      7.4          7.3          8.1          [illegible]      [illegible]
 17  Tee Um Tee Um Tahiti                      5.7          4.4          3.8          [illegible]      [illegible]
 18  Beautiful Danube, No Wonder You’re Blue   5.8          4.7          4.9          [illegible]      [illegible]

Group 4 (26)*
 19  No Wonder                                 6.4          6.7          7.0          [illegible]      [illegible]
 20  My Heaven in the Pines                    6.2          5.8          6.2          [illegible]      [illegible]
 21  I Won’t Tell a Soul                       7.8          7.8          7.7          [illegible]      [illegible]
 22  How Can We Be [illegible]                 omitted because plugged before first sitting
 23  Blue Interlude                            4.4          3.7          4.2          U                L
 24  [illegible]                               8.5          8.3          7.0          U                B

* Number of subjects in each group.

The magazine Variety lists each week those songs which
have been most broadcast. Playing of new popular songs over 
the air is called plugging and those songs which appeared once 
or more in Variety’s lists of most broadcast songs during the 
first month of this study were classified as Much-Plugged (P). 
All other songs are called Little-Plugged or Un-Plugged (U). 
It may be mentioned in passing that changing the criterion 
for ‘‘Plugged’’ from one appearance in Variety’s list to three, 
only removes one song (Number 3, ‘‘A Song from Old 
Hawaii’’) from the plugged group. 

The mean ten-point scale rating of each song at each sitting 
is given in Table 1. 

The songs are divided crassly, in the sixth column of Table 1, 
into plugged and unplugged songs according to the criterion 
of appearance or non-appearance in Variety’s lists. But vari-
ations in the plugging of these songs are extensive both in num-
ber of plugs in one week and in number of weeks of appear-
ance in the lists. For example, ‘‘Song from Old Hawaii’’
appeared only twice in Variety’s list, being played in two con-
secutive weeks eleven and ten times; ‘‘All Ashore’’ at the other
extreme appeared for seventeen consecutive weeks and was 
played over 400 times. It is interesting that once dropped 
from Variety’s list, no song re-appeared later. Table 2 shows 
the total number of playings of the plugged songs for each 
week of the experimental period. Week Number 1 is the week 
of the first sitting of the first group to sit. 


TABLE 2
Number of Plugs per Week

Week Number        1    2    3    4    5    6    7    8    9   10   11   12
Number of plugs  131  137  153  129  152  136  169  156  169  135  140  123
Total 1730








Although several of these songs had been plugged for two 
or three weeks before the first sitting (the number of plugs 
before was 282, or 14 per cent of the total number of plugs)
the subjects, except for a few individuals, reported that the
songs were new to them at the time of the first sitting. A 
fortunate though fortuitous fact is the comparative constancy 
of plugging of the songs throughout the experimental period. 
This constancy entitles us to investigate the significance of 
differences between sittings 1 and 2 as well as over-all differ- 
ences, between sittings 1 and 3. A Chi-square calculation 
shows that the number of plugs is essentially constant through- 
out the period of the experiment. 
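
The constancy check can be retraced from the weekly totals in Table 2. The source does not say how the Chi-square calculation was set up; the sketch below assumes a goodness-of-fit test against an equal expected number of plugs in each of the twelve weeks, and the function name is illustrative.

```python
# Weekly plugs for weeks 1-12, as given in Table 2 (total 1730).
plugs_per_week = [131, 137, 153, 129, 152, 136, 169, 156, 169, 135, 140, 123]


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts per week."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Critical value at the 0.05 level for 11 degrees of freedom
# (12 weeks minus 1), taken from a standard Chi-square table.
CRITICAL_05_DF11 = 19.675

chi2 = chi_square_uniform(plugs_per_week)
print(round(chi2, 2))           # 17.92
print(chi2 < CRITICAL_05_DF11)  # True
```

On this assumption the statistic falls below the tabled value, so the week-to-week variation is no larger than chance fluctuation would produce at the 0.05 level, which agrees with the statement that plugging was essentially constant.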

Table 3 shows the mean rating at each sitting for the ten 
plugged and the twelve unplugged songs. 


TABLE 3
Ratings for Plugged and Unplugged Songs, by Sittings

                  Sitting
               1      2      3
Plugged       6.5    6.6    6.4
Unplugged     6.4    5.9    5.9





A statistical test¹ shows that the only significant difference (at
the ‘‘0.05 level’’) is that between the first and second sittings 
for unplugged songs. This state of affairs may be interpreted 
as follows. 

Plugging maintains students’ ratings of songs at a constant 
level. Songs which are not plugged are rated lower when they 
are heard again. 

It appears, then, that extensive plugging does not increase
the liking of students for popular songs. But if songs are
not plugged, they fall off in rating, and the plugging can be
expected to hold the rating up to a level of roughly 6.5 on a
ten-point scale. Owing to the small number of songs used in this
study the constancy of rating of the plugged songs can only
be asserted with an accuracy of about 0.5 of a scale unit. The
expected drop in rating of unplugged songs is only just over
this critical limit.

1 Student’s t-function used as described by R. A. Fisher in ‘‘Statistical
Methods for Research Workers,’’ IV Edition, p. 113, for correlated meas-
urements.

The effect of plugging might be expected to be different for 
songs originally better liked and for those originally less liked. 
To test this hypothesis, the plugged songs were divided into 
two groups, those (5) above their median score (6.67) and 
those (5) below this score. Similarly the unplugged songs 
were divided into two sub-groups, above (7) and below (6) 
their median (6.28). The above-median sub-groups are called 
better liked and are labelled B in column 7 of Table 1. The 
below-median sub-groups are called less liked and labelled L. 

Table 4 shows the mean ratings for the four sub-groups. 


TABLE 4
Ratings of Liked and Plugged Songs, by Sittings

                           Sitting
                        1      2      3
Better Liked Songs
  Plugged              7.7    7.4    7.1
  Unplugged            7.4    7.2    6.8
Less Liked Songs
  Plugged              4.9    5.7    5.6
  Unplugged            5.6    4.9    5.0





The only statistically significant difference between sittings 
in this table is that between the first and second sittings for less 
liked unplugged songs. But the general trend of the four 
groups must also be viewed as significant. It is seen that those 
songs that were initially better liked fall off slightly in rating 
whether or not they are plugged. It is in the less liked groups 
that the importance of plugging appears, for here the trend 
is upward for plugged and downward for unplugged songs. 
It should be possible to study the effects of desire-to-be-fashion- 
able, of musical competence, and of ability to play the piano. 

Table 4 justifies the following contention: 

Plugging does not affect students’ ratings of better liked 
songs. Plugged and unplugged songs fall off together at the
rate of about 0.3 of a scale unit for each sitting. 

Plugging does affect the rating of songs originally less well 
liked. Such songs are rated about 0.3 of a scale unit higher 
for each successive sitting. 
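
The rates quoted above can be checked against the means of Table 4. The sketch below takes the change per sitting as the difference between the third and first sittings divided by two; that definition, and the names used, are assumptions made for illustration.

```python
# Mean ratings at sittings 1, 2 and 3, copied from Table 4.
# The 7.2 entry is read from a damaged table cell; it is not used
# in the calculation, which needs only the first and last sittings.
TABLE_4 = {
    ("better liked", "plugged"): (7.7, 7.4, 7.1),
    ("better liked", "unplugged"): (7.4, 7.2, 6.8),
    ("less liked", "plugged"): (4.9, 5.7, 5.6),
    ("less liked", "unplugged"): (5.6, 4.9, 5.0),
}


def change_per_sitting(ratings):
    """Average change per sitting, from the first to the last sitting."""
    return (ratings[-1] - ratings[0]) / (len(ratings) - 1)


trend = {group: round(change_per_sitting(r), 2) for group, r in TABLE_4.items()}
for group, value in trend.items():
    print(group, value)
```

On this reckoning only the less liked plugged songs show a positive change; the other three groups fall by about 0.3 of a scale unit per sitting.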

We have made, then, one differentiation with respect to a 
property of the songs, namely better liked and less liked; and 
one distinction with respect to an outside influence, namely 
much plugging and little plugging. It remains to investigate 
differences between the subjects. The important difference 
here is taken to be that between those who are more and those 
who are less interested in popular music. As an index of
interest the subjects were divided into those who listened to
the radio more than two hours a day, and those who listened
less than two hours a day. Table 5 shows the difference in
ratings for these two groups.











TABLE 5
Ratings of Songs by Those Who Listen More, by Sittings

                            Sitting
                         1      2      3
More daily listening
  Plugged ...........   7.2    7.6    7.2
  Unplugged .........   6.6    6.1    5.9
Less daily listening
  Plugged ...........   6.5    6.9    6.7
  Unplugged .........   6.1    5.3    5.5





In the first place, those who listen more to the radio rate the 
songs consistently higher for every sitting and whether or not 
the songs are plugged. This measures the greater enthusiasm 
of the habitual listeners. The difference in rating is about 0.6 
of a scale division for each sub-group. Secondly, the plugging 
of songs is no more effective with those who listen more than 
with those who listen less. The differences in rating as be- 
tween plugged and unplugged songs are almost exactly the 
same for those who listen more (0.6, 1.5, and 1.3) and for those 
who listen less (0.4, 1.6, and 1.2). 
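
The differences cited in this paragraph follow directly from Table 5. A short sketch of the arithmetic is given below; the variable and function names are assumptions made for illustration.

```python
from statistics import mean

# Mean ratings at sittings 1, 2 and 3, copied from Table 5.
MORE = {"plugged": (7.2, 7.6, 7.2), "unplugged": (6.6, 6.1, 5.9)}
LESS = {"plugged": (6.5, 6.9, 6.7), "unplugged": (6.1, 5.3, 5.5)}


def plugging_gap(group):
    """Plugged minus unplugged rating at each sitting, to one decimal."""
    return [round(p - u, 1) for p, u in zip(group["plugged"], group["unplugged"])]


def listening_gap(kind):
    """More-listening minus less-listening rating at each sitting."""
    return [round(m - l, 1) for m, l in zip(MORE[kind], LESS[kind])]


print(plugging_gap(MORE))  # [0.6, 1.5, 1.3]
print(plugging_gap(LESS))  # [0.4, 1.6, 1.2]

# Average advantage of the habitual listeners over all six cells.
all_gaps = listening_gap("plugged") + listening_gap("unplugged")
average_gap = round(mean(all_gaps), 1)
print(average_gap)
```

The two lists of plugging differences are the figures given in the text, and the average advantage of the habitual listeners comes to about 0.6 of a scale division.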










SUMMARY 


One hundred thirty-four high school and college students 
liked ten popular songs that were later widely broadcast, no 
better than thirteen songs that were seldom or never
broadcast. After a month, however, the plugged songs were liked
as well as before, while the unplugged songs were rated some- 
what lower. After a second month, there was no change in 
rating of plugged or unplugged songs. Dividing the songs 
into those that are initially well-liked and less well-liked, it is 
found that plugging does not affect the ratings of the more- 
liked songs. Plugged and unplugged songs fall off in rating 
together. Plugging does affect the rating of songs originally 
less well-liked. These songs, if plugged, increase slightly in 
rating at each sitting. 

There was no observable difference in effectiveness of 
plugging for radio-enthusiasts as against casual listeners. 




















III. Program Research 
THE ‘‘PROGRAM ANALYZER’’


A NEW TECHNIQUE IN STUDYING LIKED AND 
DISLIKED ITEMS IN RADIO PROGRAMS 


JACK N. PETERMAN 
Office of Radio Research, Columbia University 


IN investigating the radio listeners’ reactions to a program,
it is often desirable to know their attitude to specific items
in that program. Does the variety show listener like the
‘‘gags,’’ the skits, the music, or perhaps the voice of one of
the actors? And why? To what extent is one radio
commercial more disliked than another, and for what specific
mercial more disliked than another, and for what specific 
reasons? Which parts of an educational program that fea- 
tures dramatizations and commentators are most liked (or 
disliked) by radio audiences, and what are the reasons for 
such reactions? 

If the radio listener is asked for an answer to these questions 
after having heard a given program, it is too frequently found 
that he just doesn’t remember, or that the more recent memory 
of the later parts of the program influences his recollection of 
the earlier parts. If, on the other hand, an attempt is made 
to get his reaction during the program, one encounters the 
difficulty that the stimulus situation is no longer ‘‘normal.’’ 
The initial listening experience has been disrupted. This 
apparent impasse can be largely eliminated by the utilization 
of a newly developed mechanism that, without interfering 
with or disrupting the program, records the likes and dislikes 
of people, while they are listening to it. This reaction- 
recording machine and the technique of its use were conceived 
and developed by Paul F. Lazarsfeld and Frank Stanton. 

The ‘‘Program Analyzer’’ is essentially a modified polygraph
in which a paper tape moves continuously under eleven paired 
sets of pens. Each of these pens rests on an individual sup- 
port that jogs it when a magnet is energized by the pressing 
of a push-button. Of each pair of pens, one makes a red 
record line and is activated by a push button having a dis- 
tinctive red top; the other pen leaves a black record line and 
is activated by a button with a green top. In addition, one 
extra pen is connected to an electric second-timer which marks 
a time line on one edge of the continuously moving tape. 

In operation, one pair of buttons (i.e., those that activate 
a paired set of pens) is given to each of the subjects. There 
being eleven such pairs, it is possible to test as many as eleven 
subjects simultaneously. The subjects are instructed as
follows:

‘‘Please take the push button with the green top in
your right hand and the one with the red top in your left 
hand. We will play back to you an electrical transcrip- 
tion of an actual program that was recently broadcast 
over the air. We should like you to indicate how you like 
this program while you are listening to it. If, during 
any part of the program, you feel that for any reason 
whatever you like what you are listening to, please press 
the green button in your right hand—green means ‘go.’ 
If, for any reason you dislike what you are listening to, 
press the red button in your left hand—red means ‘stop.’ 
If any part leaves you indifferent, just do not press either 
button. To repeat then: like—right hand, green button; 
dislike—left hand, red button; indifference—no hands, 
no button. At no time should you press both buttons. 
And please don’t jiggle the buttons but keep pressing the 
appropriate button for as long as the part you like or dis- 
like continues.’’¹

After these instructions, the subjects are given a short prac- 
tice period during which their comprehension of the procedure 
is checked. This is immediately followed by the test proper. 

At the completion of the playing and mechanical judging
of the program, the tape is removed from the ‘‘Analyzer’’ and
the parts marked as ‘‘liked’’ or ‘‘disliked’’ by each of the
subjects are located. Using the time line on the ‘‘Analyzer’’
tape as a means of identifying the parts of the program
involved, each such part is located on the program recording²
and is played back to the subjects. The subject, with that
portion of the program fresh in mind, is then interviewed and
asked just what it was that had made him like or dislike that
particular part when he first heard it.

1 These instructions, in diagrammatic form, are also placed on a
blackboard in front of the subjects in order to minimize the
possibility of error or misunderstanding.

By summating the ‘‘likes’’ and ‘‘dislikes’’ of a number of 
subjects, it is possible to obtain quantitative indications as to 
the extent to which the program is liked or disliked by the 
whole group of subjects. 
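
The summation just described can be sketched in code. The representation of the tape is an assumption made for illustration: each subject's record is taken to be a list of (reaction, start second, end second) entries read off against the time line, with no overlapping entries for one subject. The names `summate` and `tape` are hypothetical, and the readings are invented.

```python
def summate(responses, program_seconds):
    """Count, second by second, how many subjects held each button.

    `responses` maps a subject to a list of (kind, start, end) tuples,
    where kind is "like" or "dislike" and the button was held from
    second `start` up to, but not including, second `end`.
    """
    likes = [0] * program_seconds
    dislikes = [0] * program_seconds
    for intervals in responses.values():
        for kind, start, end in intervals:
            if kind == "like":
                line = likes
            elif kind == "dislike":
                line = dislikes
            else:
                raise ValueError(f"unknown reaction: {kind!r}")
            # Clip each interval to the length of the program.
            for second in range(max(start, 0), min(end, program_seconds)):
                line[second] += 1
    return likes, dislikes


# Hypothetical tape readings for three subjects and a ten-second excerpt.
tape = {
    "subject 1": [("like", 2, 6)],
    "subject 2": [("like", 4, 8), ("dislike", 0, 2)],
    "subject 3": [("dislike", 1, 3)],
}
like_line, dislike_line = summate(tape, 10)
print(like_line)
print(dislike_line)
```

Plotting the first list upward and the second downward against time gives a chart of the kind discussed later in this paper.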

A study made of the Public Affairs Weekly Program of May 
22, 1940, can be taken as an example of the way the technique 
described may be applied. The program dealt with the 
abridgment of civil liberties in the United States during times 
of crisis. It presented, by means of monologues, dramatiza- 
tions, and narrators, a series of situations in which the freedom 
of the press and the right of free speech were interfered with 
in the United States. Thus, after the introductory announce- 
ments, the program proceeded as follows: 


Time           Medium       Content
28"-31"        Organ        Somber music
31"-45"        Narrator     ‘‘We talk of freedom now . . . Life,
                            liberty and the pursuit of happiness.’’
45"-1'2"       Chorus       ‘‘Life, liberty and the pursuit of
                            happiness . . .’’
1'2"-1'4"      Organ        Somber chords
1'4"-1'13"     Town Crier   Decree restricting religious freedom
1'13"-1'16"    Organ        Somber chords
1'16"-1'29"    Town Crier   Decree prohibiting criticism of the king
1'29"-1'31"    Organ        Loud chord
(etc.)


2 The location of specific items is made possible by the use of a gauge, 
calibrated in terms of minutes and seconds, which is placed radially across 
the acetate disk on which the program is electrically recorded. 

Altogether there were 45 such parts to the program. Each
part, with the exception of the choruses, was a fairly distinct 
unit. The choruses, where they occurred, served as bridges 
or links joining the two parts between which each of them 
was placed. 

Structurally, the broadcast contained:

Announcer parts
Speaking-chorus parts
Town Crier parts
Dramatizations
Emotional commentator parts
11 Narrator parts
11 Organ parts
6 Monologue parts


As to subject matter, it chiefly sought to emphasize that at 
times such as these, when war hysteria is on the increase, the 
public must be especially cautious if our civil liberties are to 
be retained. 

The program was tested on a total of 52 subjects who had 
come to the studios of Station WOR in order to participate in 
a Radio Listeners’ Conference. The group was made up of 
47 females and 5 males. The ages of the women ranged from
18 to 65 (mean: 41.2); those of the men, from 17 to 38 (mean:
26.2).³

The pattern of responses of the individual subjects varied 
considerably. Seventeen of the listeners alternately indicated 
all three possible reactions during the course of the program, 
liking some parts, disliking others, and being indifferent to 
the rest; eighteen subjects gave only responses of liking 
or indifference; and eleven only indicated either dislike or 
indifference. Six listeners remained indifferent throughout. 

The individual listeners also differed considerably in the
amount of the program which they liked or disliked. At one
extreme one listener indicated a dislike for 94 per cent of the
program. At the other extreme one subject indicated a liking
for 75 per cent of this same program. On the average, these
52 listeners liked 15.1 per cent, disliked 5.5 per cent, and were
indifferent to 79.4 per cent of the total program. This large
percentage of indifference is traceable to the fact that 17 (or
a third) of the 52 subjects were indifferent to 95 or more per
cent of the program.

3 No information was obtainable concerning their socio-economic
or educational level, but in general they seemed to belong to
that ‘‘average’’ urban group which has had at least a grammar
or high school education and lives on a slightly less than
$2,000 a year level.

The individual responses differed in still another way. The 
length or duration of each separate like or dislike response 
varied from one second to 793 seconds in the case of dislikes 
and from one to 510 seconds in the case of likes. The median 
length of the periods of liking was 15 seconds; that of the dis- 
likes, 10 seconds. 
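
The individual measures reported in the last two paragraphs (the share of the program liked, disliked, or left unmarked, and the median length of a response) could be computed from a tape record in the following way. The record format and the names are assumptions made for illustration, and the figures are invented.

```python
from statistics import median


def listener_profile(intervals, program_seconds):
    """Per cent of the program one listener liked, disliked, or left unmarked.

    `intervals` is a list of (kind, start, end) tuples in seconds, with
    kind "like" or "dislike"; intervals are assumed not to overlap.
    """
    liked = sum(end - start for kind, start, end in intervals if kind == "like")
    disliked = sum(end - start for kind, start, end in intervals if kind == "dislike")
    indifferent = program_seconds - liked - disliked
    return tuple(
        round(100 * seconds / program_seconds, 1)
        for seconds in (liked, disliked, indifferent)
    )


def median_duration(intervals, kind):
    """Median length in seconds of the responses of one kind."""
    return median(end - start for k, start, end in intervals if k == kind)


# Hypothetical record of one listener for a 200-second program.
record = [("like", 0, 30), ("dislike", 50, 60), ("like", 100, 120)]
profile = listener_profile(record, 200)
print(profile)
print(median_duration(record, "like"), median_duration(record, "dislike"))
```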

In spite of all these individual variations the reactions of 
the 52 subjects show a clear pattern when they are summated 
as in the accompanying figure. An examination of this
histogram shows that some parts of the program were distinctly
more liked than others, and similarly that some parts were 
more markedly disliked. There are five points at which the 
peaks in the solid upper line rise considerably above the level 
of the ‘‘likes’’ for the rest of the program, and five points at 
which the dips in the dashed lower line fall markedly below 
the general level of the ‘‘dislikes.’’ A listing of the character- 
istics of the program, as well as the subjects’ responses at these 
points, proves revealing. 

The first thing that stands out is that the five points of 
maximal liking (B, C, E, J and K) all occur when the
‘‘Narrator’’ is speaking. About five times as many listeners
indicated ‘‘likes’’ at these parts of the program as had
indicated ‘‘dislikes.’’

No such near unanimity of opinion is to be found for the 
parts of the program that were disliked. The first and most 
disliked point (A) occurs where a loud chord is struck on the 
organ following the reading of a decree by the Town Crier. 
The second, as well as the fourth and fifth points of maximal
dislike (D, G and H), occur when the Chorus is speaking. Point
‘‘F’’ comes at a dramatization in the program where the
breaking up of an open air meeting by the police is portrayed.

An analysis of the comments made at these points by the
subjects plus a further examination of the characteristics of 
the ‘‘like’’ and ‘‘dislike’’ reactions throw some light on why 
these parts of the program were thus liked or disliked. 

At the five ‘‘Narrator’’ parts which were most liked the
comments made referred both to the manner of presentation
and to the material presented. Those who liked the
medium (in this case the man’s voice) liked it because it
was ‘‘real’’ and ‘‘earnest.’’ Those who liked these parts for
their content did so because, as they stated, ‘‘That was my
thought exactly,’’ ‘‘I liked this because I know it’s true . . .,’’
and ‘‘It’s the principle that underlies that thought. . . .’’

But why did the listeners like these ‘‘Narrator’’ parts and 
not the others? True, even these less liked ‘‘Narrator’’ parts 
were more liked than the average non-Narrator parts of the 
program. But there are considerable differences between the 
degrees to which these eleven parts were liked. The reason 
for these differences would seem to be that there is an inherent 
difference in the subject matter (rather than the manner of 
presentation) that makes some of the ‘‘Narrator’’ parts more 
liked than the others. An examination of the contents of the 
program at these ‘‘Narrator’’ parts shows this to be true. 

The most liked Narrator parts were rather simple and direct 
formulations of the rights of Americans and the principles of 
individual liberty. 

‘‘A man’s home is his refuge . . . a secure nation is a
strong nation. Men without fear are strong men, and
[illegible] men build strength in the land they love.’’

‘‘. . . all men are equal . . . their equality shall be
[illegible] by the Bill of Rights.’’

Such are typical abstracts from the more liked parts. The
less liked ‘‘Narrator’’ parts, on the other hand, involved
somewhat abstruse commentaries on these principles and rights
and elaborations on instances where they were abrogated.
Some examples of these are:


‘‘Listeners, we are all guilty, we shout hatred after
other dreams than ours, call every ‘ism’ from the limbo
of abortive hopes, and in a nervous world, fear for our
own safety, the safety of the world we made. But, when
we deny these men the right to talk in public places, we
deny free speech, the right to peaceably assemble. Only
in death does man deny himself, and democracy lies dying
if the people are afraid of the dreams that blossomed in
their father’s blood!’’

‘‘These are the days of witches flying in the minds of
men, and fear and rumor spread from mouth to mouth
like November blown taste of snow. . . .’’

Thus, though all the ‘‘Narrator’’ parts were liked both
because of their content and because of the medium, it was the
character of the subject matter that led the listeners to like
five of these more than the others.

In the case of the five maximally disliked parts, the opposite 
held true. The important role here was played not by the 
subject matter but by the medium through which it was pre- 
sented. At these parts, only one of the sixteen comments 
made by those who disliked them had any reference to the 
subject matter; and even this one also referred to the manner 
in which the material was presented. 

At ‘‘A,’’ where a loud chord was sounded on the organ, the
subjects indicated that they thought it ‘‘too weird’’ and ‘‘too
theatrical.’’

Similar reasons were also given at the other ten points at 
which the organ sounded, and at each there was likewise a
drop in the line of likes and a dip in the line of dislikes.
At none of these points, however, was the program
disliked as much as at point ‘‘A.’’ The cause for the indication 
of dislike being so marked at this point would seem to lie in the 
nature of the material that immediately precedes it. This 
preceding part contained a proclamation by the Town Crier 
concerning the limitation of free speech, which was also 
markedly disliked. The organ chord, coming after such a 
disliked part, made the dislike line go down to an even lower 
point than otherwise. 

The unfavorable reactions at the ‘‘chorus’’ parts, ‘‘D,’’
‘‘G’’ and ‘‘H,’’ as indicated by the listeners’ comments, seem
to be mainly due to the sound of the voices. ‘‘Rasping
crowd’’ was the way one subject characterized them. The
reason for the dislikes here depended on the way the material
was presented.

It should be noted, however, that though these parts called 
out more dislike judgments than other parts of the program, 
they at the same time elicited as many or more like-judgments. 
In other words, there were as many or more people who 
‘‘liked’’ these parts as those who ‘‘disliked’’ them. This is 
not a discrepancy. The indication here is that even these 
outstandingly disliked parts were accepted favorably by many. 
But why? 

Part of the answer would seem to be that those who accepted 
these sections of the program did so because of their favorable 
attitude to the program as a whole and more specifically to the 
parts of the program immediately preceding these disliked 
points. In other words, the feeling of approval for the well- 
liked parts was carried over to succeeding sections of the 
program which might otherwise not have been so favorably 
received. 

This carry-over effect can be shown by comparing the
median length of the like-reactions which precede and extend
into or begin during each of the five periods of maximal likes,
with the median length of the like-reactions which similarly
precede the five periods of maximal dislike. In the latter all
the medians are greater than 26 seconds, while in the former no
median is greater than 25 seconds. Combining these medians
for the five disliked periods gives an average of 39 seconds;
combining those for the liked parts gives an average of 14
seconds, a difference which is statistically reliable even for
the small number of cases involved. Not only is the actuality
of the carry-over effect thus established, but the fact that
it is so much more operative during the maximally disliked
periods serves to explain the number of likes that were being
registered during these periods.










Another aspect of this carry-over effect is reflected in the 
comments made by those who liked the three chorus parts. 
With but one exception, these subjects indicated that they 
liked these parts because of the subject matter which gave 
them a feeling of patriotic pride. 


‘‘It indicates the American way of life. . . . My
ancestors have lived here since 1678. . . .’’
‘‘I was taking a certain amount of pride. . . .’’


Now the subject matter, as has already been shown, was the 
main reason given for liking other parts of the program also 
(for the basic principles underlying all parts of the program 
were the same). It is therefore not surprising that those who 
liked the program because of its content should continue liking 
it regardless of whether the material is presented through the 
medium of a ‘‘Narrator’’ (whose voice some found so likable) 
or by means of a chorus (which others found so annoying). 
Objective corroboration for this explanation is contained in 
the histogram where it is found that though there are drops 
in the line of ‘‘likes’’ at these ‘‘disliked’’ parts the extent of 
these drops is not commensurate with the extent of the dips 
in the line of ‘‘dislikes.’’ Those who liked a given part tended
to continue liking the program for some time after that part 
was over. 

The reasons given for disliking the dramatization at ‘‘F’’
again chiefly stressed the manner of presentation. ‘‘Loud
and harsh’’ and ‘‘shouting’’ were the descriptive terms used.
The number of comments, however, is too small to permit a 
more specific analysis. Nor can any further light be gleaned 
by a comparison with the other two dramatizations in the 
program since they are just not comparable in either manner 
of presentation or specific content. 

Summarizing, the testing of this Public Affairs Weekly pro- 
gram by means of the Program Analyzer resulted in two 
groups of findings. 


1. The identity and location of the most ‘‘liked’’ and
‘‘disliked’’ parts:

a. Five of the eleven parts where the ‘‘Narrator’’
speaks were liked more than any other parts of the
program.

b. One of the points where the organ is played, three
of the chorus parts, and the dramatization in which
an open air meeting is broken up were the points at
which the greatest number of listeners registered
dislike.

2. Specific reasons why these parts were liked or disliked,
as derived from an analysis of:

a. The characteristics of the like-dislike reactions,
which indicated a carry-over effect that caused the
disliked sections to be less disliked when they
followed a liked part.

b. The content of the program at these parts which,
in the case of the ‘‘Narrator,’’ showed that the
simple and direct statements were more liked than
involved or abstruse commentaries.

c. The comments made by the subjects, which showed
that the reasons given for liking the various parts
of the program were about equally divided between
references to the subject matter and the manner of
presentation, while the reasons for disliking
stressed chiefly the medium through which the
parts were presented.


In short, the Program Analyzer, by isolating and evaluating 
the factors that were operative in determining the reactions 
to specific program items has given an objective answer to the 
question: What is it about this particular program that was 
liked or disliked, and why? 

Other programs, whether they be of the variety, musical, 
drama, or educational types, can all be similarly studied. 

In conclusion it should be pointed out that the technique 
itself still needs to be further investigated with regard to the 
following aspects:

1. The reliability of a particular set of findings obtained 
through its use. This can be determined either by the re- 
testing of the same subjects or the method of split halves. 

2. The nature of group differences. For example, whether
the young tend to like more parts of a program than their
seniors, or whether the better educated listeners tend to be
more critical and hence give shorter like-dislike reactions.

3. The establishment of standard like-dislike indices. 

4. The application of the technique in controlled experi- 
ments. In such, one part of a total program could be sys- 
tematically varied with respect to one factor, such as content, 
manner of presentation or location, while the rest of the 
program remained constant. 

















AN EXPLORATORY STUDY OF THE 
RELIABILITY OF THE ‘‘PROGRAM 
ANALYZER’’


HORACE SCHWERIN 
Raymond Spector Company, Inc. 


THE practical use of the Lazarsfeld-Stanton program
analyzer as an instrument for the evaluation of the
component parts of a radio program is dependent upon
the reliability of the scores obtained. The reactions of 38
persons on the analyzer have been subjected to various
statistical treatments in order to bring to light some of the
problems affecting a study of reliability. In view of the size of
the sample none of these results can be accepted as conclusive,
but their general tenor is at least encouraging.

The 38 subjects were selected at random from a group of 
middle-class persons who had visited a radio broadcasting 
station. The total sample was divided into two groups—A 
and B—of 19 persons each. In each group there was a 
roughly equal number of men and of women, and a roughly 
equal number of people above and below 30 years of age. 

The program used was a news program which was about 
two weeks old. The instructions given were similar to those
reported by Peterman.¹ In view of the nature of the program,
however, the subjects were not asked whether they liked or
disliked it, but whether they were strongly interested or not
interested in it. The green button was pressed in cases of
strong interest and the red button if the subject was not
interested; in cases of mild interest, no button was to be pressed.

The news broadcast lasted fifteen minutes and three seconds
and for the purpose of this analysis was arbitrarily divided
into 24 items of different time-lengths. (There was no
indication of those divisions in the actual test run.) The
divisions were made so as to include a single news item, two or
three very similar news items (e.g., all concerned with air
raids), or commercial announcements. There were 16 news items
and 8 non-news items.

1 ‘‘The Program Analyzer: A New Technique in Studying the Liked
and Disliked Items in Radio Programs,’’ this issue, page 728.

Tabulations were made by items. The strong interest score 
was taken as the total number of seconds that all persons in 
the group registered strong interest for a specific item. The 
mild and non-interest scores were computed accordingly. The 
sum of the three scores for any one item would always equal the 
number of seconds in the item multiplied by the number of 
persons in the sample. These three scores were then changed 
to a percentage basis for each item, thus eliminating differ- 
ences due to the varying length of the item. 
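
The scoring rule just described amounts to a small piece of arithmetic, sketched below. The function name and the example figures are assumptions made for illustration; only the rule itself (seconds summed over persons, then expressed as a percentage of item length times sample size) comes from the text.

```python
def item_scores(strong_seconds, none_seconds, item_length, n_persons):
    """Percentage scores for one program item.

    strong_seconds / none_seconds: total seconds, summed over all persons
    in the group, during which the green / red button was held.
    """
    total = item_length * n_persons  # person-seconds available in the item
    mild_seconds = total - strong_seconds - none_seconds
    if min(strong_seconds, none_seconds, mild_seconds) < 0:
        raise ValueError("button time is inconsistent with the item length")
    return {
        "strong": 100 * strong_seconds / total,
        "mild": 100 * mild_seconds / total,
        "none": 100 * none_seconds / total,
    }


# Hypothetical 40-second item heard by a group of 19 persons.
scores = item_scores(strong_seconds=380, none_seconds=76,
                     item_length=40, n_persons=19)
print(scores)
```

Because the three raw scores always add up to the person-seconds in the item, the three percentages add up to 100 whatever the length of the item.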

Curves similar to the one reproduced on page 733 of this 
issue were constructed for the program, giving the rise and 
fall of interest- and no-interest-reactions as the program went 
on. An inspection of those curves drawn for the same pro- 
gram but constructed separately for the two groups A and B 
showed a great similarity in trend, thus indicating that the
two sample groups reacted in a rather similar way. The 
curves are not reproduced here because it is possible to com- 
pare the reactions of the two samples in simple quantitative 
form. 

For this purpose the items of the program were ranked 
according to the strong interest score and no-interest score 
just explained. The rank orders of score values thus result- 
ing for the two samples were very much alike. The strong 
interest scores for the 24 items, if groups A and B were com- 
pared, had a Pearson r correlation of .89. For the same items, 
if ranked by the no-interest score, the two samples showed a 
correlation of .93.²
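
The comparison of the two samples rests on the product-moment correlation between their item scores. A minimal sketch of that computation follows; the five pairs of scores are invented for illustration and are not the values obtained in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists of at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical strong interest scores (per cent) of five items in two groups.
group_a = [10, 20, 30, 40, 50]
group_b = [12, 25, 29, 45, 48]
r = pearson_r(group_a, group_b)
print(round(r, 2))
```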


2 In order to see how small a sample could be taken, Group B was
once more subdivided at random into two further samples
containing 9 and 10 people respectively. For those two groups
the correlation between the strong interest scores was .89 and
between the no-interest scores, .85, thus showing that with a
rather homogeneous group of people even very small samples can
give a reliable ranking of program-items according to the
interest they arouse.

In view of the small number of cases (19 in each sample)
this high agreement in the relative evaluation of the 24
program items is very surprising. Even in a homogeneous group
it cannot be expected that so small a sample will always be
sufficient for a program test; and, of course, if it comes to the
study of group differences, the samples will have to be
considerably larger. It can be shown that the high agreement
found with the present program is partly due to the fact that
the program items showed great variations in interest for the
listeners; two samples, as a matter of course, are more likely
to agree on the relative interest in program items if the
differences between items are very great, than if they are small.

In order to analyze this problem in a concise way, it is
desirable to use a measure which combines the strong-interest
and no-interest scores into one index. As such an index, the
Lazarsfeld-Robinson trichotomy formula was used.³

3 See Paul F. Lazarsfeld and W. S. Robinson: ‘‘Some Properties
of the Trichotomy, ‘Like, No Opinion, Dislike’ and their
Psychological Interpretation.’’ Sociometry, Vol. III, No. 2,
1940, p. 151. The technique consists essentially in assuming
that the proportion of interested and the proportion of
not-interested people correspond to the two tail ends of a
normal distribution curve. The index then locates the center
of the distribution curve on a scale of negative and positive
attitudes. An index value of zero corresponds to a neutral
attitude. The higher the index, the more favorable is the
average attitude of the group under investigation.

It turned out that Sample A had an average score of .28,
whereas Sample B had an average score of .51 for all the 24
items of the program. Thus we see that, by and large,
Sample B liked the program considerably better than Sample A;
still the relative interest in the 24 items was very much alike,
as could be inferred from the high correlation coefficients
presented above. Furthermore, it was found that the 16 news
items proper were much better liked than the 8 non-news items
(announcements of various kinds included in the program).
For the news items alone Sample A had an average score of
.60 and Sample B of 1.05. The corresponding indexes for
non-news items were -.36 and -.58 respectively, which shows that
they were definitely uninteresting to the test listeners.
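
The published Lazarsfeld-Robinson formula is not reproduced in this paper. The sketch below is only one possible reading of the verbal description given for the index, under an added assumption that is not in the source: the two cut points separating ‘‘not interested,’’ ‘‘mild interest,’’ and ‘‘strong interest’’ are placed at -1 and +1 on the attitude scale. The function name is hypothetical.

```python
from statistics import NormalDist


def trichotomy_index(p_interested, p_not_interested):
    """Center of a normal attitude curve whose two tails match the data.

    Assumes (for illustration only) that attitudes below -1 register as
    "not interested" and attitudes above +1 as "strongly interested".
    """
    if not (0 < p_interested < 1 and 0 < p_not_interested < 1):
        raise ValueError("both proportions must lie strictly between 0 and 1")
    if p_interested + p_not_interested >= 1:
        raise ValueError("the middle (mild interest) share must be positive")
    standard = NormalDist()
    z_low = standard.inv_cdf(p_not_interested)    # lower cut in standard units
    z_high = standard.inv_cdf(1 - p_interested)   # upper cut in standard units
    # Solve (-1 - center) / spread = z_low and (1 - center) / spread = z_high.
    return -(z_high + z_low) / (z_high - z_low)


print(round(trichotomy_index(0.30, 0.30), 2))
print(round(trichotomy_index(0.50, 0.10), 2))
```

On this reading equal tails give an index of zero, and the index passes 1 once more than half the group registers strong interest, which is at least consistent with index values above 1 being reported for the news items.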

If we study the 16 news items alone, then we deal with items
between which there is much less variation in the listeners’
reactions than if we include the non-news items, which
are generally uninteresting and therefore introduce the large
variation between the items. As a result, if we compare the
reaction of our two samples, A and B, the 16 news items alone 
show a correlation of only .63. This will lead us definitely to 
expect that for programs consisting of rather similar items, a 
larger group of test persons will be needed to get reliable 
results. Tests comparing samples of 40 people each are just 
now under way. 














THREE TYPES OF ‘‘LIKE’’ REACTIONS IN 
JUDGING POPULAR SONGS 


CUTHBERT DANIEL 
Office of Radio Research, Columbia University 


THE polygraph, or program-analyzer, described elsewhere 
in this volume, has been used in a preliminary study to 
determine what parts of popular recordings are best 
liked. A multitude of indices of liking spring immediately to 
mind. The duration of liked parts, the proportion of the whole 
piece liked, the number of responses corresponding to different 
aspects of the piece, stratification of the likers by age, sex, 
musical education, familiarity with the piece played, correla- 
tions of responses with recognizable elements in the music, sub- 
grouping by types of music, and many other indications of the 
meaning of liking might well be tried. 

Since the present study used only 9 subjects, and these not 
representative of any known audience, only one interesting 
index will be analyzed here. The lengths or durations of the 
responses were studied. Table 1 gives the distribution of 
responses by duration for nine subjects hearing seven sides of 
popular recordings. 

It is clear from this table that the responses of three seconds 
or less constitute a special group. The responses of length 5 to 
6 seconds are present in unexpectedly high frequency, consider- 
ing the small number found in each of the one-second ranges 
above and below this range. Finally there is the rambling 
group of longer responses. 

Inspection of the music itself, and of what is being played 
when these responses are being made, permits a different inter- 
pretation to be given to each of these three groups of responses. 


1 The experimental work reported in this study was done by Mr. Ger- 
hart Wiebe, with the assistance and supervision of Dr. T. W. Adorno. 


TABLE 1 
Distribution of Responses of Different Lengths 





Time in                      Time in 
seconds      Frequency       seconds      Frequency 

0               16            7- 8 
0- 1            17            8- 9 
1- 2            23            9-10 
2- 3            24           10-15           14 
3- 4             5           15-20           16 
4- 5             5           20-25            5 
5- 6            17           25-30           12 
6- 7             5           over 30         22 





1. Those of less than three-second duration represent oc- 
casions when the listener changed his mind. He thought he 
was going to like the following section (generally the beginning 
of a new chorus) and found, in less than 3 bars, that he did not 
especially care for it. These are to be called non-serious 
responses since they did not correspond to a part of the music 
actually liked. 

2. The responses that cluster in the 5- to 6-second range are 
of just the length of a four-bar section, so characteristic of 
modern jazz. They represent, then, liking of the easily recog- 
nized sections of choruses. 

A detailed analysis of the parts of the records that were 
liked by 5 or more of the nine listeners (probability of coin- 
cidental chance liking less than 0.05) shows that all these espe- 
cially liked parts are clear statements of the chorus, i.e., of the 
main tune, with embellishment subordinate, harmony regular, 
and jamming never so free as to diminish the recognizability of 
the tune. This analysis is not published here because of the 
known unrepresentativeness of the sample, but its main finding 
is worth mentioning in support of the interpretation of the 
5- to 6-second responses. 

3. The remaining responses of longer duration show no 
clustering but only a fairly uniform distribution over the 
remaining possible range. The longest response was of 142 
seconds’ duration, 85 per cent of the entire duration of the 
recording. These responses represent more nearly the excited 
reaction of the jitterbug, who, as he would say, gets solid, stays 
in the groove, and is out of the world for whole choruses, 
choruses repeated, and even through several repetitions. 

Statistical test shows that there is no significant difference 
between subjects in the number of less-than-3-second responses. 
There is wide difference between subjects, however, in the other 
two types of response. 





STUDIES IN RADIO EFFECTIVENESS BY THE 
PSYCHOLOGICAL CORPORATION 


HENRY C. LINK AND PHILIP G. CORBY 
Psychological Corporation 


EFFECTIVENESS VS. POPULARITY OF RADIO PROGRAMS 


AS this account of studies by the Psychological Corpora- 
tion is being written, we are conducting a study using 
a technique developed in 1933. At that time, and in 
the few preceding years, the emphasis in the study of radio 
broadcasts by advertisers was on the measurement of the size 
of the audience in terms of the Cooperative Analysis of Broad- 
casting. On the assumption that a program, in order to be 
effective in selling its product, must not merely be popular but 
must register the product advertised in the minds of its listen- 
ers, we introduced in one of our Brand Barometer studies of 
1933 a series of questions in pairs somewhat as follows: 


a) Have you been listening to the Show Boat radio program 
on Wednesday nights? 
b) (If Yes) What product does this program advertise? 


This technique grew directly out of the Triple-Associates Test 
which we had developed a year earlier and which is repre- 
sented by such questions as: 

What flake soap advertises: ‘‘Stop those runs in stockings’’? 

In our first test of eight popular radio programs, the results 
were quite startling. For instance, two radio programs of 
almost equal popularity were found to differ one hundred per 
cent in the extent to which listeners were able to identify the 
product advertised by the program. That is to say, twice as 
many people were able to identify the product advertised in 
the case of one program as in the case of the other program. 
Here immediately was a collection of evidence which showed 
rather conclusively that the size of the radio audience was not 
a reliable measure of the effectiveness of the program in reg- 
istering the product it was trying to sell. 


SALES EFFECTIVENESS OF RADIO PROGRAMS 


The more complete and satisfying measure of effectiveness 
is of course one in terms of sales. However, before a person 
can be induced to buy, some sort of mental or emotional im- 
pression must be made on him. The fact that a program 
listener can identify the product advertised on that program 
is one good indication that some such impression has been 
made. It happens that we were able to test simultaneously 
both the extent to which a program had registered its product 
in the minds of listeners and the extent to which it had affected 
the brand purchases of those listeners. 

Our periodic Brand Barometers, which have been continu- 
ously conducted since 1932, were already giving us a measure 
of the percentage of people buying a given brand or a given 
product. For example, they provided a periodic record of the 
percentage of people buying the various brands of dentifrice, 
the various brands of shaving soaps and creams, toilet soaps, 
cigarettes, etc. Now, in a Brand Barometer asking what 
brand of dentifrice had been purchased last, we included also 
a question asking whether the interviewees listened to a cer- 
tain dentifrice radio program, and a question as to what brand 
of dentifrice was advertised by that radio program. Then, 
by breaking down the listeners into those who could and those 
who could not identify the product advertised, as well as those who did 
not listen at all, and calculating the percentage of people in 
each of these groups who bought the dentifrice in question, 
we obtained definite clues to the sales effectiveness of the radio 
program. 
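
The breakdown described above can be sketched as follows. This is an illustration only: the field names, the brand names, and the handful of interview records are hypothetical, and the Brand Barometer tabulations themselves are not reproduced in this account.

```python
from collections import defaultdict


def purchase_rate_by_group(interviews, brand):
    """Per cent of each listening group whose last purchase was `brand`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [buyers, people interviewed]
    for row in interviews:
        if not row["listens"]:
            group = "non-listeners"
        elif row["named_brand"] == brand:
            group = "listeners who identified the product"
        else:
            group = "listeners who did not identify the product"
        counts[group][1] += 1
        if row["last_bought"] == brand:
            counts[group][0] += 1
    return {group: 100.0 * buyers / total for group, (buyers, total) in counts.items()}


# Hypothetical interview records, invented purely to exercise the function.
interviews = [
    {"listens": True, "named_brand": "Brand X", "last_bought": "Brand X"},
    {"listens": True, "named_brand": "Brand X", "last_bought": "Brand Y"},
    {"listens": True, "named_brand": None, "last_bought": "Brand Y"},
    {"listens": True, "named_brand": "Brand Y", "last_bought": "Brand Y"},
    {"listens": False, "named_brand": None, "last_bought": "Brand Y"},
    {"listens": False, "named_brand": None, "last_bought": "Brand Y"},
    {"listens": False, "named_brand": None, "last_bought": "Brand X"},
    {"listens": False, "named_brand": None, "last_bought": "Brand Y"},
]

rates = purchase_rate_by_group(interviews, "Brand X")

if __name__ == "__main__":
    for group, pct in rates.items():
        print(f"{group}: {pct:.1f} per cent bought Brand X")
```

Comparing the three percentages is what yields the ‘‘clues’’ mentioned in the text: a purchase rate that climbs from non-listeners to listeners to listeners who can name the product is consistent with the program having some sales effect, though it does not by itself establish one.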

Recently, a client who had a radio program with a 25 point 
Crossley rating decided to advertise one of his products in 
one-half of the country, and another product in the other half 
of the country. He then had special studies made by two 
research organizations, the results of which agreed in showing 
that the radio program was highly effective, in fact far more 
effective than all sales indications permitted the client to be- 
lieve. A breakdown of the Nielsen Index, however, showed 
that the sales effectiveness of this radio program in each half 
of the country was perceptible but only moderate in extent. 
A breakdown of our Brand Barometer results for these prod- 
ucts, by the same two broad territories, also showed a degree 
of sales effectiveness for each brand in each half of the coun- 
try, but again in only moderate degree. 

A summary of many radio advertising studies made by the 
Psychological Corporation reveals primary emphasis on the 
qualitative rather than quantitative aspects of radio programs. 
This emphasis has usually included: (1) a measure of the ex- 
tent to which the program is impressing the product and its 
desirability upon the mind of the listener; (2) a measure of 
the extent to which these impressions are being converted into 
new customers as measured by such instruments as the Brand 
Barometers or research devices by which sales results can be 
measured not only in terms of sales volume but in terms of 
the psychological unit of influence, that is, customers. 


THE RELATIVE EFFECTIVENESS OF RADIO AS AN 
ADVERTISING MEDIUM 


In the spring of 1933 the Psychological Corporation made 
its first major study for the National Broadcasting Company. 
Its title was ‘‘A Study of the Relative Effectiveness of Major 
Advertising Media.’’ Its purpose was to compare the relative 
effectiveness of four important media—radio, newspapers, 
magazines, and billboards. The study was made in the fields 
of toilet goods, groceries, and gasolines. After a careful test- 
tube study to develop a reliable technique, it was decided to 
base the study on interviews with 2,720 dealers. In a sense 
this study might be called a poll of public opinion among deal- 
ers with regard to the relative effectiveness of the four media 
in question. However, it went beyond the usual poll of public 
opinion in that a method was developed by which to obtain 
objective verification of the opinions expressed. For instance, 
each dealer was asked to name three of his best selling nation- 
ally advertised brands. He was then asked, for each brand, 
‘‘What kind of national advertising has helped these sales 
most?’’ From more than a thousand brands mentioned, the 
four hundred nationally distributed brands most frequently 
mentioned were taken and the influences mentioned by dealers, 
that is, radio, newspaper, etc., were analyzed, and the results 
presented in detail. These results showed, briefly, that the 
dealer was highly accurate in his statements of the specific 
advertising campaigns present in connection with the brands 
he mentioned. 

Moreover, the evidence obtained showed conclusively that 
the dealers’ knowledge of the current advertising for various 
brands and its effectiveness came directly from his contacts 
with the consumer and remarks made by the consumers in the 
acts of purchasing. In short, although this study was based 
on interviews with dealers, objective evidence showed that it 
was really a measure of the advertising influence of radio pro- 
grams upon the consumers. In fact, it was originally agreed 
that this information could not be obtained by approaching 
the consumers directly, and the dealer approach therefore 
represented an indirect means of measuring the influences of 
advertising on consumers. Naturally, in view of the results 
of this study, which showed that radio was overwhelmingly the 
most effective advertising medium at that time, the reliability 
and validity of the techniques employed had to be most care- 
fully worked out and verified in advance. 

Almost two years later this study was repeated for the 
National Broadcasting Company, using exactly the same tech- 
nique and with 2,518 dealers. Similar results were found. 
Radio was still by far the most effective medium in the field 
of toilet goods, groceries and gasoline. In the first two fields 
the effectiveness of radio showed a significant increase, in the 
third a decrease. 

Per Cent of Dealers Who Considered Each Medium Most Effective* 

                    Druggists         Grocers          Gasoline 
                   1933    1934     1933    1934     1933    1934 
Radio              65.0    70.3     58.3    62.3     69.4    63.2 
Magazines           6.9     7.1      6.8     8.1      4.8     7.9 
Newspapers         24.6    20.3     31.3    27.2     14.5    14.1 
Billboards          0.6     1.0      1.3     2.0      6.7    12.8 





* Reproduced with permission of the National Broadcasting Company. 


The practical judgments of business men in their increasing 
use of radio for advertising since 1934 have given empirical 
vindication to the rather startling results which our original 
study produced. 


PANEL STUDIES OF RADIO PROGRAMS 


We had for some years been conducting panel studies as a 
means of testing products. However, in 1936, Dr. H. P. 
Longstaff conducted a radio panel study for one of our clients 
based on a panel of 100 housewives who were paid to listen to 
a certain program carefully and regularly for a period of six 
weeks. The most significant thing about this panel probably 
was the fact that it was selected not from the housewives who 
were already listening to the program, but largely from house- 
wives who were not listening or who, if they had listened a 
few times, had stopped. The object of this study was to dis- 
cover the reasons for the failure of this program to reach a 
larger audience, as well as to obtain clues by which it could 
be improved, both in its entertainment value and in its selling 
effectiveness. As a result of this study the program was 
radically changed. 

Since that time the Psychological Corporation has conducted 
a considerable number of panel studies in the field of radio. 
Some of these studies have been made in connection with radio 
programs intended primarily as institutional programs calcu- 
lated to develop the good-will of the public. Other studies 
have been in connection with radio programs calculated pri- 
marily to sell a product. Although panel studies are some- 
times made on the basis of mailed questionnaires, our studies 
have demonstrated that the most practical and reliable man- 
ner in which to set up a panel and maintain the necessary 
controls is through personal interviews supplemented by tele- 
phone calls. Also, these panel studies have usually included 
not only the people already listening, but a proper proportion 
of people who are not listeners or who, having been listeners, 
have stopped. It is from these latter groups, the non- 
subscribers or non-listeners, that the most critical and helpful 
information has often been obtained. 

Panel studies, while presenting certain technical difficulties 
in operation, have proved, in our experience, to be extremely 
fruitful in helping the broadcaster to strengthen the best 
points of his program, eliminate the weak elements, point up 
his selling message, and work out the kinks which are bound 
to occur from time to time. 


EDUCATIONAL BROADCASTS 


Are the radio stations giving enough time to educational 
broadcasts? This question has become the subject of frequent 
controversies and discussions between the broadcasting com- 
panies, the Federal authorities, and educators throughout the 
country. In 1934 when this question was being raised with 
particular insistence by the Federal Radio Commission, the 
National Broadcasting Company asked us to make a study of 
the listening habits in six communities where educational 
broadcasts had been specially developed. These communities 
were chosen with reference to such outstanding educational 
stations as those operated by Ohio State University, Iowa State 
University, University of Wisconsin, etc. 

The method developed for this study was one based on per- 
sonal calls through which were obtained the names of the 
programs and stations listened to every day for a period of 
one week. After the programs listened to had been recorded, 
the following question was asked: ‘‘Which of these programs 
do you find have educational value?’’ The interviewer was 
asked to name each program already entered and to check 
those described by the listener as being educational in his 
opinion. Moreover, if this question in regard to educational 
programs reminded the listener of any program which he had 
previously forgotten to mention, and which had not yet been 
entered on the interview blank, the interviewer was instructed 
to add certain programs to those already entered. In other 
words, every effort was made to obtain the most favorable 
results possible with reference to the broadcasts of the educa- 
tional stations as compared with the commercial stations. The 
results were remarkable in two respects: 

First, the more strictly educational programs of the educa- 
tional stations had very small audiences. 

Second, the listeners designated as educational a substantial 
number of commercial programs which had hitherto been 
thought of primarily as entertainment programs, not as edu- 
cational programs. Programs designated as definitely educa- 
tional included popular news broadcasters such as Lowell 
Thomas, many musical programs, especially those of a more 
serious nature, certain farm programs which gave information 
in respect to markets and prices, certain dramatic programs, 
certain series of talks by men prominent in their field, many 
religious programs and sermons, many programs on household 
management, cooking, food values, etc., programs on the rear- 
ing of children, etc., etc. 

In short, whereas a great variety of programs, educational 
in the conventional sense and given by conventional educators, 
obtained a very small audience, a very considerable number 
of highly popular commercial programs with no pretensions 
to formal education and not in the hands of professional edu- 
cators, were positively labeled by the listeners as being 
educational. 

STATION POPULARITY 


One of the earliest and most extensive types of radio re- 
search concerns itself with station popularity as contrasted 
with program popularity. This type of station research was 
usually based on a question like: ‘‘Which station do you lis- 
ten to most frequently?’’ An interesting byproduct of the 
study of educational broadcasting just described came as a 
result of a question which was included in this study, namely: 
‘‘What stations do you or your family tune in on most fre- 
quently?’’ When the relative popularity of stations was 
computed on the basis of this question, and compared with 
the popularity of stations derived from an analysis of the pro- 
grams to which people said they had listened, radical differ- 
ences were discovered. For instance, 75 per cent of a popu- 
lation may have named a certain station as the one to which 
they listened most frequently; but when the listening time was 
computed in terms of actual programs mentioned, this station 
obtained a popularity rating of only 40 per cent. The reverse 
was also found to be true, that is, a station mentioned as being 
the most frequently listened to by 15 per cent of the people, 
on the basis of programs mentioned was found to have a popu- 
larity of possibly 28 per cent. Such findings were not the 
exception but the rule. These findings were confirmed in a 
comprehensive and elaborate manner in the Buffalo Radio 
Audience study to be described below. 


RADIO AUDIENCE STUDIES 


We have made studies of the size of the radio audience by 
practically all the known methods: the coincidental telephone 
method; the coincidental personal interview method (now 
being used by our Chicago Office); personal interviews imme- 
diately following a broadcast, several hours after a broadcast, 
the next day, and several days later; and the roster method in 
various forms. The method used has usually been determined 
by the specific purpose which the broadcaster had in mind. 
However, in view of the importance of such general yardsticks 
as the Cooperative Analysis of Broadcasting, the Hooper 
broadcasting surveys, and possibly others, we have made tenta- 
tive attempts to evaluate the various methods in order that the 
best possible or the most comparable possible standard might 
be arrived at. 

The method which we have found especially useful from 
this point of view is the printed roster method. This method 
was utilized in one of the most comprehensive audience studies 
we have ever made, namely, The Buffalo Radio Audience, 1939. 
This study was based on personal interviews in 5,177 homes, 
5,041 of which had radios in working order. Every period of 
the day, and all programs of all local stations were covered. 

One of the most important aspects of this study was that it 
represented an absolute sampling of the population. That is 
to say, it covered all income groups, language groups, and 
localities. Such a sampling is obviously not possible when 
telephone calls are relied on. When the results were com- 
puted it was found that 45 per cent of all families interviewed 
had private telephones while the remainder did not, and this 
result was in fairly close agreement with the actual facts of 
telephone distribution. Nevertheless, practically every home 
without a telephone had a radio. 

Moreover, when the popularity of the various programs was 
computed, major differences were discovered in the relative 
popularity of different programs among the homes having 
telephones and the homes not having telephones, or roughly 
speaking, between upper and lower income levels. This study 
was made primarily for commercial purposes. However, its 
cultural and broader implications have been admirably devel- 
oped with considerable detail in the recent book by Lazarsfeld, 
Radio and the Printed Page. 

Incidentally, this study also gave a most complete and 
detailed confirmation of the extent to which the popularity 
rating of the smaller and less known stations is raised to its 
true level when measured on the basis of the programs actually 
listened to. 

















IV. General Research Techniques 
WHO ANSWERS QUESTIONNAIRES? 


EDWARD A. SUCHMAN AND BOYD McCANDLESS 
Office of Radio Research, Columbia University 


A GREAT many of the current surveys of the opinions 
and habits of various groups of people are conducted 
by means of mail questionnaires. In those surveys it 
has often been assumed that it is simply a matter of chance 
whether one individual or another answers the questionnaire. 
Several studies have doubted the validity of this assumption.¹ 
However, there is little actually known about the differences 
between those who reply and those who do not. 

In the two studies to be reported it was possible to compare 
those people who answered a mail questionnaire without fol- 
low-ups or further urging—representing the usual type of 
mail analysis—with the composition of the entire group to 
whom the questionnaire had been sent. In this way we could 
investigate those factors which motivate the return of ques- 
tionnaires and thus arrive at some estimation of the bias intro- 
duced into a study of incomplete returns. 

The first study to be discussed had as its main problem 
listening or non-listening to a child training program broad- 
cast over Station WSUI, Iowa City. A random list of 600 
women was selected from the telephone directories of Cedar 
Rapids and Iowa City, both in Iowa. A questionnaire was 
sent to these 600 women and after not more than one return 
per day had been received for five successive days, a second 
wave of the same questionnaire was dispatched to all those who 
had not answered. After a similar period of waiting, the fol- 
lowing technique for studying the non-respondents was em- 
ployed: the telephone directory was consulted a second time, 
and a random sampling of every second name from those who 
had not answered was taken; these women were then called by 
telephone, and the returns from these telephone calls were con- 
sidered to be representative of the portion of the original sample 
which had not returned questionnaires. 

1 For a more complete introduction to this problem, and references to 
other reports on it, see Stanton, Frank: ‘‘Notes on the Validity of Mail 
Questionnaire Returns.’’ Journal of Applied Psychology, Vol. XXIII, 
No. 1, February, 1939. 

The technique used in this study, then, proceeds in three 
steps: first, a regular mail questionnaire is sent; second, a fol- 
low-up mail questionnaire is sent to all not answering; and 
finally, a special telephone survey is made of a random sample 
of the remaining portion of the mail sample. The number of 
returns received each time is given in Table 1. 


TABLE 1 


Proportion of Replies to Each Wave of the Questionnaire 





                                   Number      Per cent 
Wave                                sent        replies 

First wave (mail)                    600         16.8 
Second wave (mail)                   490         34.1 
Residual sample (telephone)          141         97.2 





As a result of the above mail questionnaire waves and the 
random telephone follow-up, we are able to compare the group 
which answered the original questionnaire, representing the 
usual type of analysis made, with the group which did not 
answer.² The first factor to be investigated will be familiarity 
with the program being studied. Table 2 shows us the effect of 
familiarity upon the return of the questionnaire. 

2 In statistical manipulations with returns from the random telephone 
follow-up we shall operate on the following assumptions: (1) that our 
actual sample had 550 cases (50 cases could not be reached 
due to moving, illness, etc.); (2) that the 137 returns from the telephone 
calls represent those people who did not answer the mail questionnaires. 
Then returns to the mail questionnaires represent roughly 50 per cent of 
the sample (actually 48.7 per cent); and the residual sample, again 
roughly, represents the remaining 50 per cent of the sample (actually 
51.3 per cent). On the basis of this 1:1 ratio, it is possible for us to 
arrive at a rough estimate of the answers for the total sample. This will 
be given in the tables as the ‘‘Total Weighted Sample.’’ 


TABLE 2 

Proportion of Respondents Who Knew or Did Not Know of Program 
Answering the Original Questionnaire 

                                Per cent of those          Total 
Familiarity with program        answering who did so      weighted 
                                on first wave             sample* 

Knew of program                       33.8                  195 
Did not know of program               10.1                  347 

* See Footnote 2, page 759. 


The recipients of the questionnaire who knew about the pro- 
gram responded to a much greater extent than those who had 
no knowledge of the program. If we had stopped with the 
original returns, we see that we would have seriously overesti- 
mated the proportion of the total sample who knew about the 
program. An error arising from such a conclusion might be 
an overestimation of the effectiveness of the publicity methods 
used. It would have been claimed that almost twice as many 
women know of this program as really do. Here we have an 
example of how drastically knowledge about the subject under 
investigation affects the returns of questionnaires, and biases 
the subsequent analysis and interpretation. 
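
The arithmetic behind this overestimate can be checked with a short sketch that uses only the figures printed in Table 1, Table 2 and footnote 2. The variable names are ours, and rounding each wave's replies to whole questionnaires is an assumption about how the printed percentages were formed.

```python
# Figures as printed in Table 1, Table 2 and footnote 2.
REACHABLE_SAMPLE = 550                          # footnote 2: actual sample size
MAIL_WAVES = [(600, 16.8), (490, 34.1)]         # (questionnaires sent, per cent replies)
WEIGHTED = {"knew": 195, "did_not_know": 347}   # total weighted sample, Table 2
FIRST_WAVE_PCT = {"knew": 33.8, "did_not_know": 10.1}

# Share of the reachable sample that answered by mail (footnote 2 gives 48.7).
mail_replies = sum(round(sent * pct / 100) for sent, pct in MAIL_WAVES)
mail_share = 100 * mail_replies / REACHABLE_SAMPLE

# Estimated number of first-wave respondents in each familiarity group.
first_wave = {group: WEIGHTED[group] * FIRST_WAVE_PCT[group] / 100 for group in WEIGHTED}

# Per cent who knew of the program: among first-wave respondents only,
# and in the total weighted sample.
knew_among_first_wave = 100 * first_wave["knew"] / sum(first_wave.values())
knew_in_total = 100 * WEIGHTED["knew"] / sum(WEIGHTED.values())
overestimate = knew_among_first_wave / knew_in_total

if __name__ == "__main__":
    print(f"mail returns as share of sample: {mail_share:.1f} per cent")
    print(f"knew of program, first wave only: {knew_among_first_wave:.1f} per cent")
    print(f"knew of program, weighted total:  {knew_in_total:.1f} per cent")
    print(f"ratio: {overestimate:.2f}")
```

On these figures roughly 65 per cent of first-wave respondents knew of the program, against about 36 per cent of the weighted total, a ratio of about 1.8. This agrees with the statement that almost twice as many women would have been credited with knowing the program as really do.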

Are there other factors affecting questionnaire returns? 
The influence of education is quite marked, as can be seen from 
Table 3. Whereas almost one out of every two of the respon- 
dents with a college education returned the questionnaire the 
first time, only one out of every five with a high-school educa- 
tion and one out of every ten with a grammar school education 
did so. 

TABLE 3 

Proportion of Respondents on Different Educational Levels Answering 
the Original Questionnaire 

                                Per cent of those          Total 
Education                       answering who did so      weighted 
                                on first wave              sample 

College and above                     44.2                   43 
High school                           19.5                  314 
Grammar school or less                11.1                  135 

However, in computing the effect of education on the ques- 
tionnaire returns there is reason to suspect that education is 
but another aspect of knowledge of the program. We knew 
from other studies that familiarity with an educational broad- 
cast is highly related to education.³ Perhaps what we are 
measuring now is not the effect of education but rather another 
aspect of familiarity with the program. Table 4 reveals 
clearly that education of the respondent remains an important 
factor in addition to familiarity in determining the distribu- 
tion of returns such as these. Table 4 tells us that, for both 
the informed and uninformed groups, relatively more first- 
wave blanks were returned by the high education group than 
by the middle or the low education group. Similarly for each 
of the educational groups, more returns were forthcoming 
from the informed than from the uninformed group. Con- 
sidering both factors simultaneously, the percentage of highly 
educated informed listeners who returned first wave question- 
naires in this case was more than eight times as great as the 
percentage of less educated, uninformed, non-listeners. 
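
The relation between Table 3 and Table 4 can be checked from the printed figures alone. The sketch below pools the informed and uninformed groups of Table 4, using the base figures given beneath that table, and compares the extreme cells. The function and variable names are ours and are introduced only for illustration.

```python
# Table 4 as printed: per cent answering the first wave, and base counts,
# for (knew of program, did not know of program) at each educational level.
TABLE_4 = {
    "College or above":       {"pct": (59.2, 18.8), "base": (27, 16)},
    "High school":            {"pct": (30.7, 10.7), "base": (127, 187)},
    "Grammar school or less": {"pct": (22.3, 7.1),  "base": (36, 99)},
}


def pooled_rate(pct, base):
    """Per cent answering when the informed and uninformed groups are pooled."""
    answered = sum(p * n / 100.0 for p, n in zip(pct, base))
    return 100.0 * answered / sum(base)


# Pooled first-wave return rate by education, comparable with Table 3.
by_education = {level: pooled_rate(**cell) for level, cell in TABLE_4.items()}

# Highly educated, informed group against less educated, uninformed group.
extreme_ratio = (
    TABLE_4["College or above"]["pct"][0]
    / TABLE_4["Grammar school or less"]["pct"][1]
)

if __name__ == "__main__":
    for level, pct in by_education.items():
        print(f"{level:<24}{pct:5.1f}")
    print(f"ratio of extreme cells: {extreme_ratio:.1f}")
```

The pooled rates for the college and grammar school groups reproduce the Table 3 entries of 44.2 and 11.1. The high-school rate comes out somewhat below the printed 19.5; the source gives no basis for deciding whether rounding or a misprint accounts for the difference. The ratio of the extreme cells is a little over eight, as the text states.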

To summarize then, two factors appear to have influenced 
returns to this survey, namely familiarity with the broadcast 
and education of the respondent. Our results indicate that 
the rate of questionnaire returns falls as familiarity with the 
topic under investigation and the education of the respondent 
decrease. 

Additional evidence on these factors affecting questionnaire 
returns can be gathered from the second of our studies, which 
deals with serious music broadcasts. In this study a random 
list of 900 listeners⁴ was selected from the 10,000 subscribers to 
the Masterwork Bulletin, a booklet listing the music broad- 
casts of Station WNYC, New York City. We begin, then, 
with a group we know is interested in the questionnaire, as 
opposed to the unknown group picked at random in the pre- 
vious study. 

3 Lazarsfeld, Paul F., Radio and the Printed Page. Duell, Sloan & 
Pearce, 1940. p. 19. 

TABLE 4 

Proportion of Respondents to Original Questionnaire According to 
Education and Knowledge of Program* 

                                Per cent of total group answering 
                                      original questionnaire 
Education                       Knew of program    Did not know of 
                                                       program 

College or above                     59.2               18.8 
High school                          30.7               10.7 
Grammar school or less               22.3                7.1 

* Base figures for percentages given in Table 4. 

                                Knew of program    Did not know 

College                               27                 16 
High school                          127                187 
Grammar school                        36                 99 

From the beginning, every possible effort was made to secure 
the cooperation of all respondents on the list. The original 
questionnaire was sent out three times, each time accompanied 
by a more urgent request for a response. For the fourth wave 
the questionnaire was shortened and only the key questions 
were asked. 

The technique employed in this study, then, consisted of 
follow-up mail questionnaires until the sample of names had 
been virtually exhausted. The returns for each wave of the 
questionnaire are given below in Table 5. 

4 While 900 was the original number of subscribers selected, it was nec- 
essary to eliminate, from the beginning, 41 names as impossible to contact 
due to such reasons as ‘‘moved, no forwarding address,’’ ‘‘unknown,’’ 
‘‘deceased,’’ etc., and 39 who, while acknowledging receipt of the 
questionnaire, could not answer due to ‘‘illness,’’ ‘‘not a subscriber, don’t 
listen,’’ i.e., friend sent in name, etc. It was felt that these 80 did not 
really belong to the sample and they were eliminated, making the total to 
be studied 820. 


TABLE 5 


Proportion of Replies to Each Wave of the Questionnaire 








Ware Questionnaire Per cent Cumulative per 
sent returned cent of total 
STE ctanheapiciincins 820 44.3 44.3 
Second 00.0.0... 457 46.4 70.2 
OR cian 245 50.3 84.7 
OED cnrecinrsin 122 66.5 95.1 
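The arithmetic behind Table 5 can be checked with a short sketch. It assumes, as the text implies, that each wave went only to those who had not yet replied; the function name and the rounding to whole questionnaires are illustrative choices, not part of the original study. Because the printed rates are rounded, a cumulative column recomputed this way need not match the printed one to the last digit.

```python
def wave_chain(first_mailing, pct_returned):
    """Return (sent, replies) per wave, assuming each wave goes only to
    those who have not yet replied (an assumption drawn from the text)."""
    rows = []
    sent = first_mailing
    for pct in pct_returned:
        replies = round(sent * pct / 100)  # whole questionnaires returned
        rows.append((sent, replies))
        sent -= replies                    # non-respondents receive the next wave
    return rows


# Per cent returned in each wave, as printed in Table 5.
rows = wave_chain(820, [44.3, 46.4, 50.3, 66.5])
for number, (sent, replies) in enumerate(rows, start=1):
    print(f"wave {number}: sent {sent}, returned about {replies}")
```

Under that assumption the recomputed numbers sent (820, 457, 245, 122) agree with the second column of the table.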





As in the first study, we separated out those individuals 
who replied to the first wave and analyzed them for those char- 
acteristics which influenced the return of the questionnaire. 
Two factors were found which significantly affected the re- 
turns, namely personal experience with the problem and gen- 
eral interest in the topic. Due to the highly selected nature of 
the sample in this case, it was impossible to study the factor of 
education. There were very few subscribers to the Masterwork 
Bulletin who had less than a high-school education.⁵ 

The effect of personal experience with the problem was stud- 
ied by analyzing for each respondent the role played by radio 
in the development of his own interest in music. We should 
suspect that where radio played a more important part in the 
development, we would find a greater response to the ques- 
tionnaire. The respondents were classified into three groups, 
an ‘‘initiated’’ group, where radio was directly responsible 
for the interest in music, a ‘‘nursed’’ group, where radio was 
responsible for the growth of an interest developed through 
some other medium, and a ‘‘supplemented’’ group, where 
radio merely served as an additional source of music. 

The distribution of each of these groups in our sample was 
of primary importance to the study. Would we have been 


wrong in our evaluation of radio if we had stopped with our 
first mailed questionnaire? The answer can be seen in Table 6. 

5 The highly selected distribution of the audience to a serious music 
broadcast can be seen from a report by H. M. Beville, ‘‘Social 
Stratification of the Radio Audience,’’ published by the Office of 
Radio Research. 

TABLE 6 

Proportion of ‘‘Developmental’’ Groups Answering 
Original Questionnaire 

                                          Per cent of those 
                                          answering who did     Total 
Music development                         so on first wave    answering 
‘‘Initiated’’ a new interest                    54.7             115 
‘‘Nursed’’ a growing interest                   47.0             289 
‘‘Supplemented’’ an established interest        43.6             361 

We see that the respondents who were ‘‘initiated’’ into an 
interest in music through the radio answered the questionnaire 
more readily than the ‘‘nursed’’ or ‘‘supplemented’’ groups. 
In other words, if we had stopped with the first wave, we 
would have overestimated the number of listeners who attribute 
their interest in music to listening to the radio. The 
difference, however, is not as large as we might have expected. 
The reason for this will be brought out later in the report. 

The second factor affecting returns was found in the re- 
spondent’s concern with music in general. While there were 
available several measures of the importance of the topic to 
the respondent, the one which seemed to demonstrate our point 
most clearly was that of ‘‘level’’ of musical listening. The 
term ‘‘level’’ is used here purely in an experimental sense as 
signifying an artificial division into an upper and lower group. 
The respondents were asked to list their five favorite musical 
compositions and their answers were grouped according to 
whether a majority of the composers named belonged to the 
ranks of well-known, ‘‘great’’ composers, e.g., Bach, Beethoven, 
Mozart, or to the less ‘‘accepted’’ composers, e.g., Franck, 
Gershwin, Ravel. The results are given in Table 7. 

TABLE 7 

Proportion of Respondents on Different Musical Levels Answering 
Original Questionnaire 

                            Per cent of those answering      Total 
Level of musical taste       who did so on first wave      answering 
‘‘High’’                              56.1                    373 
‘‘Low’’                               44.4                    178 

The respondents with a ‘‘high’’ level of musical taste responded 
to the first questionnaire to a greater degree than the 
respondents with a ‘‘low’’ level of musical taste. This would 
seem to indicate that the more ‘‘literate’’ a person is about 
music in general, the more likely he is to answer a questionnaire 
concerned with music. 

From an analysis of other parts of the questionnaire we 
know that there is an inverse relationship between level of 
musical taste and reliance upon radio for the development of 
an interest in music. We find that those individuals whose 
interest in music was cultivated through attendance at concerts, 
playing instruments, etc., and for whom radio plays a 
minor role, have a higher degree of musical sophistication 
than those individuals whose reliance upon radio has been 
more extensive. These two factors, then, would tend to negate 
each other in the two tables presented above, since while an 
‘‘initiated’’ individual would be more apt to answer, he would 
also possess a lower musical taste, a factor which would work 
in the opposite direction. It is therefore necessary to examine 
both factors independently of each other for their true effect 
upon the return of the questionnaire. This is done in Table 8. 

That these two variables tend to offset each other can be 
seen in the larger differences in returns, both between the 
developmental groups and between the levels of musical taste, 
that result when each variable is examined with the other held 
constant. We find that whereas 65.4 per cent of the 
‘‘initiated’’ group with a ‘‘high’’ level of taste answered the 
questionnaire the first time, only 39.9 per cent of the 
‘‘supplemented,’’ ‘‘low’’ level group answered. 
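The offsetting of the two variables can be illustrated from the percentages and base figures printed in Table 8. The following is only a sketch: the dictionary layout and the function names are hypothetical conveniences, and the pooled rates it computes need not equal those of Table 6, whose totals differ from the Table 8 base figures.

```python
# Per cent answering on the first wave and base figures, as printed in Table 8.
# Keys: (role of radio, level of taste) -> (per cent on first wave, base).
table8 = {
    ("initiated", "high"): (65.4, 49),
    ("initiated", "low"): (51.3, 43),
    ("nursed", "high"): (58.5, 149),
    ("nursed", "low"): (44.3, 72),
    ("supplemented", "high"): (50.8, 169),
    ("supplemented", "low"): (39.9, 65),
}


def pooled_rate(role):
    """First-wave rate for a role when taste levels are pooled (weighted by base)."""
    cells = [(pct, n) for (r, _), (pct, n) in table8.items() if r == role]
    return sum(pct * n for pct, n in cells) / sum(n for _, n in cells)


def high_share(role):
    """Per cent of a role's respondents classified as 'high' taste."""
    high = table8[(role, "high")][1]
    low = table8[(role, "low")][1]
    return 100 * high / (high + low)


for role in ("initiated", "nursed", "supplemented"):
    print(f"{role:13s} high-taste share {high_share(role):5.1f}%  "
          f"pooled first-wave rate {pooled_rate(role):5.1f}%")
```

The ‘‘initiated’’ group contains the smallest share of ‘‘high’’ taste respondents, so pooling the taste levels narrows the gap between the developmental groups relative to the gap seen within a single taste level.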

TABLE 8 

Distribution of Respondents to Original Questionnaire According to Role 
of Radio in Development and Level of Musical Taste* 

                              Per cent of each group answering 
                                   original questionnaire 
Role of radio in             ‘‘High’’ musical    ‘‘Low’’ musical 
development                       taste               taste 
‘‘Initiated’’                     65.4                51.3 
‘‘Nursed’’                        58.5                44.3 
‘‘Supplemented’’                  50.8                39.9 

* Base figures for percentages given in Table 8: 

                                  High                Low 
‘‘Initiated’’                      49                  43 
‘‘Nursed’’                        149                  72 
‘‘Supplemented’’                  169                  65 

In studying the relationship of radio to the development of 
an interest in serious music, then, it was possible to investigate 
further those factors influencing the return of mail 
questionnaires. The results showed that the answering portion 
over-represented those individuals (a) for whom radio 
had played a more important part in the development of their 
interest in music, and (b) who had a higher level of musical taste. 

Although our problem has been essentially one of investigat- 
ing the bias involved, it is worthwhile to point out an impor- 
tant administrative consideration. In both studies discussed, 
it was found that a higher total response was obtained by the 
use of a follow-up to those who had not answered than would 
have been obtained if the original sample had been doubled. 
Thus we not only increase the representativeness of our sample 
tremendously, but also decrease the cost per questionnaire, 
when a sample of a certain size is desired.⁶ 
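For the second study, the comparison with a doubled original sample can be sketched from the figures of Table 5. Two assumptions are made here that the text does not state: that a single mailing to twice as many names would have drawn the same 44.3 per cent as the first wave, and that every piece mailed counts equally (although the fourth wave was in fact shortened).

```python
# Questionnaires sent and per cent returned in each wave, as printed in Table 5.
waves = [(820, 44.3), (457, 46.4), (245, 50.3), (122, 66.5)]

pieces_mailed = sum(sent for sent, _ in waves)
followup_replies = sum(sent * pct / 100 for sent, pct in waves)

# Hypothetical alternative: one mailing to twice the original list,
# assumed to draw the same 44.3 per cent as the first wave.
doubled_list = 2 * waves[0][0]
doubled_replies = doubled_list * waves[0][1] / 100

print(f"follow-up design: {pieces_mailed} pieces, about {followup_replies:.0f} replies")
print(f"doubled list:     {doubled_list} pieces, about {doubled_replies:.0f} replies")
```

On these assumptions the four waves required almost exactly as many pieces of mail as a doubled list would have, and brought in more replies.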

6 This was also found to be the case in the study by Dr. Frank Stanton 
referred to previously. We quote, ‘‘It is interesting to note in passing 
that whereas 28.3 per cent of the first sample of teachers replied to the 
original questionnaire, 50.2 per cent of the second sample responded to 
either the original form or the follow-up, an increase of better than 75 
per cent.’’ 

Table 9 gives us an idea of the comparative results obtained 
in the two studies discussed for the original and the first 
follow-up waves. 

TABLE 9 

Comparative Results Obtained with the Use of Follow-Up Questionnaire 

                                       Per cent of replies 
Study                             Original sample    Follow-up 
Parent education (First)               16.8             34.1 
Music development (Second)             44.3             46.4 




It appears that in dealing with an uninterested sample such 
as the telephone directory of the first study, a much higher 
return results from a second, follow-up questionnaire than 
from the original request. This may not necessarily be true 
where the original sample is already highly interested, as in 
the second study. However, aside from the consideration of 
increased returns the follow-up is to be recommended in both 
cases as decreasing the bias inherent in incomplete returns 
from mail questionnaires, and in those cases where there are 
still unanswered questionnaires, furnishing the researcher 
with an estimate of the bias that might be expected, by ena- 
bling him to compare the first and the second waves of the 
questionnaire. 

If we examine the three psychological factors influencing 
returns (namely, familiarity with topic, personal experience 
with problem, and importance of the subject to the respondent), 
we find that these three factors seem to center around the 
general concept of interest in the topic being studied. 
Therefore, assuming that we have a group of people to whom we 
wish to send questionnaires, the most important consideration 
appears to be, ‘‘Are they interested in the questionnaire?’’ 
We might divide all groups into those groups which are inter- 
ested and those which are not. If all the members of a group 
to be studied can be assumed to be interested in the topic, this 
factor of interest is held pretty much constant. In this case 
there is a good likelihood of a large initial return, and the 
answering group will be much more representative of the 
entire sample, having this interest in common. Conversely, 
if we are working with an uninterested group, there will be a 
very slight return, with those people who do answer coming 
from an unusual group not at all representative of the whole 
group. If we are working with a group of unknown or mixed 
interest, we will probably find that our answers come from the 
interested portion and as such are not representative.⁷ 

We have an example of this in the ‘‘hot’’ and ‘‘cold’’ lists 
spoken of by people who deal with mail solicitation. In start- 
ing out with a ‘‘hot’’ list of people who have already proved 
their interest, the distinction is no longer the important one of 
an ‘‘interested’’ or ‘‘not interested’’ group, but simply of a 
‘‘more’’ or ‘‘less’’ interested group. A ‘‘cold’’ list consists 
of uninterested and untried individuals, and the returns them- 
selves contain an unknown factor which makes them unreli- 
able as far as the entire group is concerned. 


SUMMARY AND CONCLUSIONS 


(1) Two techniques were used which proved successful in 
securing complete representativeness of the basic sample: 

a) An original mail questionnaire, followed by a second 
mail questionnaire, and finally by a telephone survey of a 
random sample of those who had not replied. 

b) An original mail questionnaire, followed by three waves 
of follow-up questionnaires, the last wave of questionnaires 
simplified and shortened to contain only the key questions. 

These procedures permitted a comparison of the original 
and the follow-up waves, enabling us to study those factors 
influencing questionnaire returns. 

7 In order to make use of the follow-up technique it is necessary to know 
the names of those people responding. In many studies it is advisable not 
to ask the respondents to sign the questionnaire. In such cases, some 
simple code must be worked out to identify the returns. One such 
procedure is given in the paper following this one, Rollins, M., ‘‘The 
Practical Use of Repeated Questionnaire Waves.’’ 

(2) Two main factors were found to influence the returns 
of mail questionnaires: 

a) Interest in or familiarity with the topic under 
investigation: the more interest, the greater the returns. 

b) Education of the respondent: the better educated, the 
greater the returns. 

(3) The use of the follow-up technique decreased the bias 
in the answering portion of the sample, permitted an inference 
as to the direction in which the bias was operating, and in- 
creased the total response. This suggests that instead of 
increasing the original size of a sample in order to secure a 
response of a pre-determined size, the research worker send 
out a second and possibly a third request for a response to 
those who have not answered. 














THE PRACTICAL USE OF REPEATED 
QUESTIONNAIRE WAVES 


A Remark on the Preceding Article, ‘‘Who 
Answers Questionnaires?’’ 


MALCOLM ROLLINS 
Promotion Director, Cosmopolitan Magazine 


In selling magazine advertising space, facts about the habits 
of readers are of marked value. 

Recently we wished to know whether our readers 
traveled by regular commercial air lines. 

We planned a post card questionnaire, the larger portion 
containing a printed letter, to ‘‘Dear Reader,’’ with the 
printed signature of our publisher; the return portion being 
the size of a government card. 

We desired about 200 replies, coming from urban readers. 
Some of the cities chosen were on regular air routes, like 
Chicago and New York, while others, like Worcester, Mass., were 
some 50 miles from any scheduled stop. 

The names were not culled except to eliminate those with 
the prefix, ‘‘Miss.’’ Cards were sent to married women, men, 
and to those names with initials only, who, of course, may have 
been either men or women. 

The questions asked were simple: Have you flown on a 
regular commercial air line? If so, what features (safety, 
speed, meals, comfort, etc.) interested you? If not, what 
features (low cost, safety, speedy arrival, etc.) would you like 
to know more about? Does your husband (wife or children) 
fly? And your age, in 5-year brackets.¹ 

1 To these questions were added others about certain foods, but they 
have no bearing on the present problem. 

Ordinarily the list would have numbered 1500. However, 
at the suggestion of the Office of Radio Research, we decided 
to use but 750 names, sending a second (duplicate) question- 
naire to those who failed to respond to the first. 

In this way, we anticipated two results: First, of course, a 
total response large enough for our purpose and, if possible, 
greater than we might have got from one attempt; and, 
second, a truer, or possibly more complete, picture of our 
readers’ flying habits. 

By simply giving each name a number and repeating this on 
the return card as a room number, it was easy to check off 
respondents as their cards came in. 

The belief that the second wave would bring a higher per- 
centage than the first one was not corroborated. Twenty- 
three per cent of the first questionnaire wave was returned, 
but only 13 per cent of the second. The second was an exact 
duplicate of the first except that in one corner, printed in red, 
appeared, ‘‘Please answer at once.’’ This unexpected result 
might be explained by the fact that we added a little stratagem 
to the procedure, which seemed to have been very effective. 
Attached to the letter portion of the questionnaire was a small 
pencil with the suggestion that after being used in answering, 
it be dropped in the respondent’s purse. 

This seems to have led to an unusually high return from the 
first wave and made the second wave relatively less effective. 
However, this is only a guess; its correctness would have to be 
tested with another sample sent the same two questionnaire 
waves but without the pencil attached to the form. 

The other expectation which we had when sending out two 
questionnaire waves was well fulfilled. The second wave 
showed that the first return had over-rated the number of 
flyers among our readers. There were 17 per cent among the 
respondents to the first wave who had flown, but only 7 per 
cent of the respondents to the second wave. Obviously those 
who are more interested in flying are more likely to answer a 
questionnaire about flying the first time. If the findings of 
the preceding article are correct, then the sharp drop in the 
number of flyers would have still another explanation. 
Wealthier and therefore more literate people are more likely 
to answer questionnaires, and as wealthier people would also 
be more likely to fly, we have both factors—interest and 
literacy—working toward a greater proportion of flyers 
responding to the first wave. 

Thus the use of a second wave helped to give us a truer pic- 
ture of the habits of all the addressees than we would have 
got from examining the returns to only one wave. 
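How much the second wave corrects the picture can be sketched by pooling the two waves. The sketch assumes that the second card went to everyone who had not answered the first and that the 13 per cent refers to those second cards; the function name is hypothetical, and the inputs are the rounded percentages quoted above.

```python
def combined_share(names, return_rates, trait_shares):
    """Pool successive waves, each sent to the non-respondents of the one before.

    return_rates and trait_shares are per cents, one entry per wave.
    Returns (total replies, per cent of all replies showing the trait).
    """
    remaining = names
    replies_total = 0.0
    trait_total = 0.0
    for rate, share in zip(return_rates, trait_shares):
        replies = remaining * rate / 100
        replies_total += replies
        trait_total += replies * share / 100
        remaining -= replies
    return replies_total, 100 * trait_total / replies_total


# Figures quoted in the text: 750 names; 23 and 13 per cent returned;
# 17 and 7 per cent of the respective respondents had flown.
replies, flyers = combined_share(750, [23, 13], [17, 7])
print(f"about {replies:.0f} replies, {flyers:.1f} per cent flyers overall")
```

On these assumptions the pooled proportion of flyers falls between the two wave figures, nearer the first because the first wave supplied most of the replies.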

In other respects the two waves of replies show reasonably 
close similarities. The age distributions of the two groups of 
respondents are very close. In both groups, lower cost was 
more important to younger people than to people over 55, 
while safety seems to be of much more interest to people over 
45 than to people under that age. Of course, both these reac- 
tions seem perfectly normal. 

The results obtained were sufficient for our needs. The 
method of double questioning a smaller list, rather than a 
single approach to a larger number, seems to offer some very 
practical advantage, and we will utilize it again. 











WHO ESCAPES THE PERSONAL 
INVESTIGATOR? 


HAZEL GAUDET 
Office of Radio Research, Columbia University 
AND 
E. C. WILSON 
Elmo Roper 


Methodologists have long been baffled by the difficulties 
of ascertaining the characteristics of the individuals 
who evade the dragnets of the most carefully 
planned personal interview surveys. By the very fact of their 
elusiveness they defy scrutiny, and devious methods must be 
resorted to in order to reveal their identity and the manner 
and extent to which they balk attempts at achieving repre- 
sentative sampling. 

An opportunity unique in the annals of personal interview 
studies to examine these individuals more thoroughly was 
presented as a by-product in the course of a study conducted 
for another purpose by the Office of Radio Research and Elmo 
Roper’s research organization.¹ In this study an original 
sample of approximately 2800 individuals, carefully selected 
as representative of a single county in Ohio, were interviewed 
in May, 1940, on their radio and reading habits, political 
opinions, and personal characteristics. During the course of 
the summer attempts were made to re-interview 1800 persons, 
selected as representative of the original sample. Because so 
much was already known about these individuals from the 
first interview, the second interviews presented an opportunity 
to examine the characteristics of the individuals who are 
difficult or impossible to contact in the personal interview 
call-backs. It was hoped that this would throw some light on the 
usual reasons for non-cooperation in interview studies. It must 
be stressed, however, that the data herein presented are based 
on attempts at re-interviewing individuals about whom a great 
deal of information had already been collected. There is no 
information available on the characteristics of the original 
refusals in this study. 

1 The latter acted for Fortune and Life magazines; the major results 
of this study will be presented in the January 1941 issue of Fortune and 
will later be incorporated into book form by the Office of Radio Research. 

Out of 1794 cases on which contacts were attempted, 159 
or approximately nine per cent were not successful. This 
group divides itself naturally into two sub-groups, those who 
refused to be interviewed constituting three per cent and an 
unavailable group making up six per cent of the total. Every 
attempt was made to avoid losing any cases. A respondent 
was never counted as a refusal until two or three different 
interviewers had attempted to secure the interview. A case 
was never considered unavailable until several attempts had 
been made to contact the individual. Interviewers were 
trained to make every possible inquiry in the neighborhood 
in order to secure clues as to the best arrangements for find- 
ing the respondent. 

The comparison between the refusals and the unavailable 
cases turned out to be interesting in itself. In many instances 
the two groups seemed to tend in opposite directions from the 
cooperating group. A few examples shown in Table 1 will 
illustrate this trend. 

Table 1 shows that refusals tended to come most frequently 
from housewives and people of low educational status. The un- 
available group was predominantly male, included more indus- 
trial workers than the other groups, and tended toward lower 
economic levels. Refusals were characterized by individuals of 
intermediate ages, while more young individuals were unavail- 
able for interviewing. 

It can be seen that most of the personal characteristics listed 
in Table 1 show opposite trends from refusals to unavailable 
cases, with the cooperating group falling in the middle. 

TABLE 1 

Some Personal Characteristics on which Refusals, Cooperators and 
Unavailable Groups Differed* 

Personal characteristics       Refused       Cooperated   Unavailable   Total 
Female                          63.2%          51.6%         30.4%      50.8% 
Low economic status             12.3           20.1          25.5       20.2 
Low educational level        [illegible]       43.2          47.0       44.0 
Industrial workers           [illegible]       28.1          44.1       28.9 
Housewives                   [illegible]       44.9          28.4       44.3 
20 to 24 years of age†       5.[illegible]      9.9          15.8       10.1 
25 to 44 years of age        [illegible]       43.2          37.6       43.1 
Number of cases‡                 57            1635          102        1794 

* All probabilities in this paper are estimated by a technique equivalent 
to the Chi-square calculation at the .05 level of significance. See Joseph 
Zubin, ‘‘Nomographs for Determining the Significance of the Differences 
Between the Frequencies of Events in Two Contrasted Series or Groups,’’ 
Journal of the American Statistical Association, 34, 207 (September 
1939), 539-544. 

† No differences were found in the older age groups. 

‡ 100% = total in each box. For instance, of the 57 refusals, 63.2% 
were female; of the 1635 cooperators, 20.1% were of low economic status. 

Educational level did not fall into this customary pattern and 
presented the most difficulties of interpretation: It seems to 
differ significantly from refusals to cooperators but not to be 
a distinctive feature of the unavailable group. Apparently 
the heterogeneity of the latter group accounts for this com- 
plication. It was actually made up of three different classes: 
vacationists, persons who could not be located or accounted for 
at all, and a miscellaneous group unavailable because of ill- 
ness, death or moving away from the vicinity. As would be 
expected, the vacationists predominated in the upper eco- 
nomic, occupational and educational levels but were evenly 
divided as to sexes. The other two groups were predomi- 
nantly males. Those who could not be located at all tended 
to be young and predominantly industrial workers, while the 
miscellaneous group tended to be old. Both of those groups 
were dominated by individuals of low educational level. The 
fact that the unavailable ones were composed of these two 
extremes was probably responsible for the findings on education. 
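The footnote to Table 1 describes the significance criterion only as equivalent to a Chi-square calculation at the .05 level. As an illustration, and not as a reconstruction of the nomograph method the authors cite, the ordinary Pearson Chi-square for a fourfold table can be applied to one comparison from Table 1. The helper names are hypothetical, and the counts are recovered by rounding the printed percentages against their bases.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def counts(pct, base):
    """Turn a printed percentage and its base into (with trait, without trait)."""
    with_trait = round(base * pct / 100)
    return with_trait, base - with_trait


CRITICAL_05 = 3.841  # chi-square, 1 degree of freedom, .05 level

# Per cent female among refusals and among unavailable cases (Table 1).
refused = counts(63.2, 57)
unavailable = counts(30.4, 102)
statistic = chi_square_2x2(*refused, *unavailable)
print(f"chi-square = {statistic:.1f}; exceeds {CRITICAL_05}: {statistic > CRITICAL_05}")
```

For this comparison the statistic lies well above the critical value, which agrees with the contrast the authors draw between the sex composition of the refusals and that of the unavailable group.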

From these characteristics certain differences in political 
opinions automatically derive. It was found that the refusals 
tended to be uninterested in the election and have very few 
definite opinions as would be typical of a group dominated 
by women and individuals of low educational status. On the 
other hand, the unavailable group tended to be much more 
definite in their opinions than either the cooperators or the 
refusals. This is probably due to the fact that the unavail- 
able group is made up chiefly of men and of people who are 
more mobile, active and interested in current affairs. A few 
sample items on which significant statistical differences were 
found are shown in Table 2. 


TABLE 2 

Significant Differences in Political Opinion among Refusals, 
Cooperators and Unavailable Cases 

Political opinions                    Refused   Cooperated   Unavailable   Total 
No interest in election                28.1%      11.9%        59.8%       12.3% 
Don’t know how to vote in 
  coming election                      32.0       24.8         15.7        24.3 
Don’t expect to vote                   23.3       10.9          5.9        11.0 
Don’t know which party will 
  probably win election                56.2       35.3         19.6        35.0 
Number of cases*                        57        1635         102         1794 

* 100%. 


The real significance of these deviations of the unsuccessful 
contacts in any survey lies in the influence which their absence 
may have on the final results. Reference to the total columns 
at the right hand side of Tables 1 and 2 will show at a glance 
that the results of the cooperators and the results shown when 
all three columns are considered together are practically identical 
in every instance. The reasons for these close similarities 
are probably two-fold. First, the fact that refusals were 
made up predominantly of women and the unavailable group 
chiefly of men, meant that in numerous instances the two groups 
cancelled each other’s influence by following diametrically op- 
posed trends in both characteristics and political opinions and 
interests. This tendency, of course, may be a function of the 
subject matter of this particular study, which caused women to 
refuse interviews more frequently than men and so to counteract 
the tendencies of the unavailable group more than they would 
in a study of greater interest to women. 

Secondly, and of primary importance, is the fact that the 
number of cases in the unsuccessful contacts is so small that 
their influence on the total trends is practically unnoticeable. 
This can be seen by comparing the second and fourth columns 
in the two tables. As long as refusals constitute no more than 
three per cent and the unavailable cases six per cent of the 
total attempted interviews, and a sufficiently large sample is 
used, it is unlikely that the loss of the unsuccessful contacts 
will seriously bias the final results. 
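The small weight of the unsuccessful contacts can be verified directly from the tables. The sketch below weights each group's percentage by its number of cases; the function and variable names are illustrative, while the figures are those printed in Table 1.

```python
# Number of cases in each group (Tables 1 and 2).
cases = {"refused": 57, "cooperated": 1635, "unavailable": 102}


def pooled_percentage(group_pcts):
    """Weight each group's percentage by its number of cases."""
    total_cases = sum(cases.values())
    return sum(group_pcts[g] * cases[g] for g in cases) / total_cases


# Two rows of Table 1, as printed.
female = {"refused": 63.2, "cooperated": 51.6, "unavailable": 30.4}
low_economic = {"refused": 12.3, "cooperated": 20.1, "unavailable": 25.5}

for label, row in (("female", female), ("low economic status", low_economic)):
    total = pooled_percentage(row)
    shift = total - row["cooperated"]
    print(f"{label}: cooperators {row['cooperated']:.1f}, all cases {total:.1f}, "
          f"shift {shift:+.1f}")
```

The weighted figures reproduce the Total column of Table 1 for these two rows (50.8 and 20.2), each within a point of the cooperators' own percentage.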























A RIGID TECHNIQUE FOR MEASURING THE 
IMPRESSION VALUES OF SPECIFIC 
MAGAZINE ADVERTISEMENTS 


D. B. LUCAS 
New York University 


The most widely used advertising audience measure, based 
upon a type of memory performance, is the recognition 
technique as developed by George Gallup. He was publishing 
and studying newspapers at a time when questionnaire 
surveys by Scott¹ and Hotchkiss and Franken² placed reader 
interest in editorials and financial news near the top. Sports 
and comics appeared to have little value in holding circula- 
tion, according to these surveys. Gallup found that when he 
took newspapers to the readers, and asked them to point out 
what they had read in those specific issues, a new set of facts 
appeared. People who would not admit the importance of 
Blondie and The Gumps in their lives, would readily admit 
that they ‘‘happened to read’’ those two strips on the morning 
of the interview. In other respects the results of these ‘‘rec- 
ognition’’ tests provided more plausible answers than had been 
obtained through questionnaires. 

In 1931 Liberty Magazine³ published the results of a Gallup 
survey of editorial matter and advertisements in general maga- 
zines. Since that time two independent test services have been 
developed, making available regular recognition ratings of 
magazine advertisements for commercial use. 

1 Walter Dill Scott, Psychology of Advertising, 1921, pp. 383 ff. 

2 G. B. Hotchkiss and R. B. Franken, Bulletin of the Bureau of 
Business Research, New York University, 1922. 

3 George Gallup, ‘‘Survey of Reader Interest in Saturday Evening Post, 
Liberty, Collier’s, Literary Digest.’’ 1931. (Liberty Research 
Department.) 

The recognition technique is very simple but systematic care is given to the 
details of the interview. Readers of a current magazine issue 
are found by making door-to-door calls. The interviewer pro- 
vides a regular copy of that issue and asks the respondent to 
go through the magazine with him, pointing out the pages and 
details which he has previously seen. Conditions are made as 
favorable as possible for the easy recognition and identification 
of previously seen material. 

Recognition and other memory tests are usually applied to 
advertisements which have been seen under typical, normal 
conditions. As a measure of the size of a reading audience 
they depend, for their validity, upon honest and accurate 
identification of previously seen material. The aim of the in- 
vestigator is to establish evidence that the advertisement was 
seen by a specified number of readers, rather than merely to 
prove that they remembered it. Albert D. Freiberg,⁴ in discussing 
recognition tests before the American Association for 
Applied Psychology, pointed out that, ‘‘. . . under laboratory 
conditions we always know which materials have been shown 
and which have not.’’ It might be added that we know when 
the test items were seen, how long and how often. These are 
all unknown variables in the application of recognition tests 
to magazine advertisements. 

The differences between recognition test conditions for magazine 
advertisements and the conditions for newspaper editorial 
features may account for faults in advertisement 
ratings. The newspaper interview may be made on the day of 
publication, whereas magazine interviews are delayed for 
several days or a week. Magazine advertisements are com- 
monly believed to be less interesting and to make less vivid 
impressions than do editorial features. Advertisements are 
more confusing since they are less distinctive, especially when 
running in campaign series and appearing in different publications. 

4 Albert D. Freiberg, ‘‘Experimental Evaluation of the Starch and 
Gallup Type of Recognition Tests of Advertisements’’ (unpublished). 
A paper presented before the American Association for Applied 
Psychology, Washington, D. C., 1939. 

When a field worker calls at the home of a magazine 
reader and asks him to identify the advertisements which he 
remembers having seen, inflationary errors may arise in any 
of the following ways: 


1. The respondent may identify an advertisement falsely 
because of confusion with others which look like it. 

2. False identification may be deliberate. 

3. Familiar adjacent material and the sequence of elements 
may lead to false identification. 

4. The readers of some magazines may be more prone than 
others to inflation through false identification. 

5. Some interviewers may encourage a higher average per 
cent of identification than others. 

6. The interview itself may produce different identification 
percentages at different stages. 


It is obvious that even if the respondent is doing his best
there are sources of error. And there is no way of insuring
that he is doing his best. One investigator claimed to have
eliminated the records of respondents who had a ‘‘far away
look in their eye.’’ He did not elaborate this part of the
technique. Almost every investigator has tried to discourage false
identification. Edward K. Strong,⁵ Hotchkiss and Franken,⁶
and Leonard W. Ferguson⁷ used advertisements more than one
year old as controls. In tabulating their data, Strong and
Ferguson assigned different weights to responses of varying
degrees of positiveness. However, H. W. Rogers and H. C.
Brown⁸ have shown that old advertisements may produce
recognition scores as high as current copy. Another
investigator stapled several unpublished advertisements into
current magazine issues as a means of testing the reliability of
both the interviewers and the respondents. A recent report⁹
describes a study in which ‘‘non-readers’’ of a magazine were
interviewed to obtain recognition scores on the advertisements.
When these ‘‘control’’ scores were compared with the regular
recognition ratings secured from readers, it was found that
non-readers rated nearly every advertisement more than one
half as high as did the readers of the issue.

5 Edward K. Strong, ‘‘Attention-Value of Advertisements,’’ Research
Bulletin (of Association of National Advertisers) No. 7, March 13, 1914.

6 G. B. Hotchkiss and R. B. Franken, ‘‘Attention Value of
Advertisements,’’ Bulletin of the Bureau of Business Research, New York
University, 1920.

7 Leonard W. Ferguson, ‘‘Preferred Positions of Advertisements in the
Saturday Evening Post,’’ JOURNAL OF APPLIED PSYCHOLOGY, 1934, 18,
749-756.

8 H. W. Rogers and H. C. Brown, ‘‘Testing Copy Tests,’’ Advertising
and Selling, October 22, 1936.

Testing of the same advertisements both before and after
publication has been carried on for more than five years at
New York University,¹⁰ using four leading weekly magazines.
A technique has been developed for controlling all of the
stated sources of inflation of recognition scores. Interviewers
carry scrapbooks¹¹ instead of whole magazine issues, but they
confine the interviews to readers of the current issue of the test
magazine. One half of the advertisements in each scrapbook
are taken from the current issue and the remainder are
obtained from the next forthcoming issue. Since this same
procedure is followed from week to week, the advance
advertisements of one week become current in the scrapbook of the
following week. This provides both a current and an advance
score on each piece tested. Each recognition rating is then
adjusted on the basis of the false identification produced by
that particular advertisement.

9 Advertising Research Foundation, Copy Testing, Ronald Press, 1939,
pp. 46 ff.

10 D. B. Lucas, ‘‘The Impression Values of Fixed Advertising
Locations in the Saturday Evening Post,’’ JOURNAL OF APPLIED PSYCHOLOGY,
1937, 21, 613-631.

11 The essential details for making up the scrapbooks may be described
briefly. Thirty test advertisements are selected from the current issue
and 30 from the advance issue of a magazine, both sets being similar as
to numbers of colored and large or small advertisements. All test pieces
are removed from the publication, and are free of date lines, page
numbers and adjacent editorial matter. Each interviewer assigns 60 numbers
to his exhibits in a different chance order, after which he mounts them
in a scrapbook in the new serial order. The cover of the current
magazine issue is placed on the outside of the book. The interview is based
on door-to-door calls for the purpose of locating people who have read
or looked into the current issue. They are then asked to go through the
book and to point out those advertisements which, in their best judgment,
they remember having seen.























[Figure 1 graphic not reproduced. Legible label: Raw Scores for
Recognition in Scrap Book, Based on ‘‘Best Judgment.’’]

Fig. 1. Diagram showing theoretical basis of recognition scores for
magazine advertisements.


Figure 1 is intended to show a theoretical basis for the con- 
trol of recognition scores. The wide horizontal band repre- 
sents all of the people who claim to have read or looked into 
the magazine previous to the interview. The area on the right 
represents those who actually saw a particular test advertise- 
ment. The upper line indicates the kind of score which might 
be expected in the regular recognition interview where people 
are asked to identify those pages of which they are sure. This 
procedure appears to produce some false answers as a result 
of honest confusion, loose guessing and plain lying. The low- 
est line represents the current score for the same advertise- 
ment when tested in a scrapbook. Because respondents are 
asked merely to use their ‘‘best judgment’’ in identifying ad- 
vertisements, this line may extend farther to the right than if 
they were asked to be ‘‘sure’’ of identifications. The
controlled recognition score is also indicated, and it is based on the
difference between the pre-test score and the current score for
the same advertisement.

Figure 2 shows an advertisement which has been rated by 
the controlled recognition method. The adjusted (or con- 
trolled) score is not the simple difference between the current 
score of 40 and the advance score of 10. The simple difference 
of 30 per cent is increased to 33 per cent on the assumption 
that the confused people in the pre-test do read magazines in 
about the same proportion as other people. 

Several formulas have been advanced for making this ad- 
justment, each derived from a different logical assumption. 
So far all of these formulas can be equated. The simplest 
procedure is to compute the difference score as a percentage of 
the reliable respondents in the pre-test; in this case, 30 as a
percentage of 90 gives an adjusted score of 33⅓. The
percentage of confusion is small on the advertisement for Pittsburgh
Plate Glass, and the controlled recognition score is close to the 
two commercial ratings. 
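
The adjustment just described can be written out as a short calculation. The following is a minimal sketch only: the function name and arguments are illustrative and do not come from the article, and it implements the ‘‘simplest procedure’’ named in the text, the difference score taken as a percentage of the respondents who did not falsely identify the advance copy.

```python
def controlled_score(current_pct: float, advance_pct: float) -> float:
    """Adjusted (controlled) recognition score, in per cent.

    current_pct: raw recognition score after publication.
    advance_pct: raw score for the same advertisement before publication,
                 taken as the rate of false identification.
    """
    if not 0 <= advance_pct < 100:
        raise ValueError("advance_pct must be at least 0 and below 100")
    reliable = 100.0 - advance_pct          # respondents not misled in the pre-test
    difference = current_pct - advance_pct  # simple difference score
    return 100.0 * difference / reliable


# Figures quoted in the text: current score 40, advance score 10.
example = controlled_score(40, 10)
print(f"{example:.1f}")  # 33.3
```

With no false identification the adjusted score equals the raw score; as the advance score rises, the same simple difference is spread over fewer reliable respondents and the adjustment grows.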

The Log Cabin Syrup advertisement in Figure 3 has a much
higher percentage of false identification for reasons which may
be obvious. (This is a relatively familiar campaign series.)
While the uncontrolled recognition scores obtained in the three
tests are much alike, the adjusted (or controlled) recognition
score is significantly lower than the two commercial ratings.

Table 1 shows a series of product-moment correlations
(Pearson r) between commercial recognition scores and those
obtained with a scrapbook. The unadjusted current scores for
three consecutive issues of the Saturday Evening Post give
coefficients which vary around +.80. There is a significant
falling off of the coefficients to approximately +.60 when final
scores are compared. This would indicate that the amount of
correction required for different advertisements varies
considerably, so that high or low raw recognition scores do not
insure proportionately high or low controlled scores. In
actual application, the correction method may not only
equalize the ratings of different advertisements, but may even
reverse the relative recognition scores.

TABLE 1

Correlations between Ordinary Recognition Scores and Controlled
Recognition Scores for the Same Magazine Advertisements¹²

Correlation Coefficients (Pearson r)
RECOGNITION SCORES
—by Leading Commercial Recognition Service
—and by Controlled Recognition Method

                                  Commercial Ratings Compared With
Saturday          No. of          Uncorrected       Corrected
Evening Post      Test            Recognition       (Controlled) Scores
Date (1939)       Exhibits        r       P.E.      r       P.E.

March 18          23              +.85    ±.04      +.66    ±.08
March 25          20              +.80    ±.05      +.56    ±.10
April 1           20              +.75    ±.07      +.61    ±.09
Combined Three    23 Full Pages                     +.38    ±.12
Issues            in Color
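
The article does not say how the probable errors (P.E.) in Table 1 were obtained. As a hedged check, the sketch below assumes the formula conventionally used at the time for a Pearson coefficient, P.E. = 0.6745 (1 - r²) / √N, with N taken to be the number of test exhibits; both the formula and that reading of N are assumptions, not statements from the source.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Probable error of a Pearson r under the assumed conventional formula."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (r, number of exhibits) pairs as printed in Table 1.
coefficients = [
    (0.85, 23), (0.66, 23),
    (0.80, 20), (0.56, 20),
    (0.75, 20), (0.61, 20),
    (0.38, 23),  # colored full pages, three issues combined
]

for r, n in coefficients:
    print(f"r = +{r:.2f}, N = {n}: P.E. = {probable_error(r, n):.2f}")
```

Under these assumptions the computed values round to the probable errors printed beside each coefficient, which suggests, without proving, that the table was prepared in this way.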

This method of correcting recognition scores through ad- 
vance testing offsets inflation from all of the expected sources. 
It also equalizes other variables introduced when ordinary pro- 
cedures are applied to different magazines. Following are the 
theoretical and demonstrated answers to the six criticisms 
commonly directed towards recognition testing as ordinarily 
applied to magazine advertisements. 


(Points 1 & 2) Control over confusion and lying is established
by testing the same advertisements under comparable
conditions both before and after their regular publication.

12 There may be a spurious element in correlating a series of advertise- 
ments running from black half-pages up to the more expensive colored 
pages. When the correlation is computed exclusively on the basis of 
colored pages in the three combined issues, it falls to +.38 as shown on 
the bottom line. 
(Point 3) The removal of adjacent material and chance
arrangement of the advertisements in the scrapbooks are
intended to eliminate any prompting provided by the original
setting.


(Point 4) When the controlled method was first applied to 
different magazines it became apparent that the average 
amount of false identification was not a constant. One of the 
weekly magazines, well known for high recognition ratings, 
established a raw recognition average of 43 per cent as com- 
pared with an average of 38 per cent for three leading com- 
petitors. 


But the false identification for this same magazine averaged
21 per cent as compared with an average of 14 per cent for
the competitors. When the corrected scores are compared they
show no difference between the one magazine and the average
of its competitors, both scores being 28 per cent. It would
appear that one of these weeklies has been enjoying a spurious
competitive advantage through greater average false
identification by its readers.
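
The comparison under Point 4 can be retraced from the averages quoted. This is a sketch under an assumption: it applies the difference-over-reliable-respondents adjustment to the magazine averages directly, whereas the original tabulation may have adjusted each advertisement separately before averaging. The function and variable names are illustrative.

```python
def controlled_score(raw_pct: float, false_pct: float) -> float:
    """Raw score less false identification, as a percentage of reliable respondents."""
    return 100.0 * (raw_pct - false_pct) / (100.0 - false_pct)


# Averages quoted in the text.
one_magazine = controlled_score(raw_pct=43, false_pct=21)
competitors = controlled_score(raw_pct=38, false_pct=14)

print(round(one_magazine), round(competitors))  # 28 28
```

The five-point advantage in raw scores disappears once the seven-point difference in false identification is allowed for.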


(Point 5) The variations in recognition scores produced by 
different interviewers have long been a problem. Field work- 
ers in most organizations are carefully selected and trained, 
and then they work under constant supervision. Any inter- 
viewer whose work does not fall into a typical pattern is 
ultimately dropped. The correctness of this pattern is estab- 
lished only by the fact that it is typical. 


Under the conditions of controlled testing it is not so serious 
when an interviewer produces a higher or lower than average 
percentage of ‘‘yes’’ responses, so long as the difference be- 
tween pre-test and current ratings remains about the same. 
Extremes in positive or negative answers automatically reduce 
the margin of difference scores. It became necessary to drop 
from the controlled recognition interviews one young lady who 
persisted in getting over 90 per cent ‘‘yes’’ answers. 


(Point 6) It has been considered likely that the serial location 
of an advertisement in the interview might affect the recogni- 
tion rating. That is the reason why each advertisement is 
placed in a different chance location in each interview kit. 
This arrangement not only compensates for variable location 
in the interview, but also provides excellent opportunity to 
study the influence of location. 
The data represented in Figure 4 are taken from 1,557
interviews on four weekly magazines. All of the scrapbooks 
contained 60 exhibits, and the chance arrangements are all 
different. Not only did the same advertisement appear in dif- 
ferent parts of the book, but so did all types and sizes of 
advertisements. It may be assumed that the variations in 
average scores for different locations are clearly a function of 
the effect of location and nothing else. 

The lower line of corrected scores shows a distinct low point
just after the middle of the interview. The upper line of
unadjusted scores maintains a generally parallel course, although
the margin of false identification may be significantly
improved at the close. Scores obtained at different stages in the
interview vary more than 5 per cent. This seems to justify
the use of varied chance locations or some other equalizing
device.

[Figure 4 graphic not reproduced. Chart title: VARIATIONS IN
CONTROLLED RECOGNITION SCORES throughout an interview requiring
60 responses.]

Fig. 4. Showing the effect of location in the interview kit upon the
recognition score for an advertisement.

In closing this discussion of the proposed method of control- 
ling recognition scores on advertisements, we must admit the 
possibility of eliminating certain procedures on the basis of 
short-cuts or proof that they are unimportant. Life magazine
has just published a report¹³ showing the simplification of
controls as applied to the identification of whole magazines as
units. The study of magazine circulation employs a control 
procedure based on the same principle as that applied to ad- 
vertisements. It has been discovered that most of the people 
who falsely claim to have read a magazine are basing their 
judgment on a small number of pages. By eliminating all 
identifications based on less than a minimum number of pages 
it is possible to simplify the computation of reliable break- 
downs of the controlled scores. While it is probable that the 
identification of a single advertisement is much more difficult 
than that of a magazine issue as a whole, it is entirely possible 
that reliable short-cuts will be worked out for tabulating, and 
even for controlling the inflation of recognition scores of 
magazine advertisements. 


APPENDIX 1 


Numerous experiments with variations in the controlled 
recognition technique throw some added light on controversial 
points. Some observers have insisted that the respondents 
should be asked to identify only those advertisements of which 
they are sure. Others have suggested that respondents should 
be advised of the presence of advance material so as to put 
them on their guard against careless ‘‘yes’’ answers. 

One set of interviews was made in which every second re- 
spondent was warned of the presence of some advance material 
which he could not have seen before. There was a significant 
falling off of the ‘‘yes’’ responses for those forewarned people. 
The average positive response on current advertisements (half 
page and larger) fell from 36.9 to 30.0 per cent. The advance 
scores fell from 16.7 to 15.1 per cent. Despite the fact that 
both current and advance scores were lower for the informed
respondents, the difference or adjusted scores also dropped 
from 24.3 per cent to 17.6 per cent on the average (calculated 
by formula). 

Just why the forewarned people produced lower corrected
recognition scores is difficult to prove. The respondent may
reason that he would be foolish to identify an advertisement
unless he is sure of having seen it in the magazine. On the
other hand, it is always relatively safe for him to say ‘‘no.’’
The writer feels that the higher scores produced when
respondents are not informed are equally valid. The objective of
this technique is the number of measurable impressions rather
than the vividness of those impressions.

13 Life’s Continuing Study of Magazine Audiences, Report No. 4,
September 1, 1940.


APPENDIX 2 


Observers of controlled recognition testing have asked 
whether interviewers might tend to give cues to the respon- 
dents as to which advertisements are current. A casual check 
of this point was obtained by asking the field workers a num- 
ber of indirect questions. The response was reassuring, as 
interviewers find themselves too busy recording the reader’s 
responses to do much coaching. Furthermore, it is possible to 
keep the interviewers generally confused as to which are cur- 
rent and which are advance advertisements. 

Numerical evidence on the question of coaching of respon- 
dents was developed in an extensive series of interviews using 
two sampling methods for arriving at the same measure. In 
one procedure, readers of weekly magazines were interviewed 
to obtain average recognition scores on any one advertisement 
(half page and larger) appearing in any one of four maga- 
zines. These magazine scores were then reduced to represent 
percentages of the total population, ten years of age and older. 
In the other procedure, the entire population of the same area 
was sampled so that scores would apply directly to that 
population. 

Since there are many times more people who do not read a 
particular magazine than who do read it, the scores for the 
first procedure were reduced in inverse ratio. In the second 
method, where the scores were obtained directly, there was no 
such dilution of the possible effects of giving cues during the 
interview. If such cues are responsible for a significant part 
of the recognition score, then it might be expected that the 
scores in the second procedure would be higher than the pro- 
jected scores of the first, owing to the much greater proportion- 
ate number of interviews. Actually the average score for a 
single advertisement in any one weekly magazine by the first 
procedure was 27.5 per cent of the readers, or 4.7 per cent 
when projected to the population for the test area. The score 
obtained by the direct method was 4.8 per cent of that same 
population. It hardly seems possible that these two pro- 
cedures could lead to such similar results if the interviewers 
were guilty of providing extraneous cues. 
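
The projection used in the first procedure is a simple scaling, and the two quoted figures allow the scaling factor to be backed out. The sketch below is illustrative only: the proportion of the population reading any one of the magazines is not reported in the text, so the value computed here is an implied figure and an assumption, and the function names are hypothetical.

```python
def implied_reader_share(reader_score_pct: float, population_score_pct: float) -> float:
    """Share of the population that must be readers for the two scores to agree."""
    return population_score_pct / reader_score_pct


def project_to_population(reader_score_pct: float, reader_share: float) -> float:
    """Reduce a score based on readers to a score based on the whole population."""
    return reader_score_pct * reader_share


# Figures quoted in the text.
reader_score = 27.5   # per cent of readers, first procedure
projected = 4.7       # per cent of population, first procedure
direct = 4.8          # per cent of population, second (direct) procedure

share = implied_reader_share(reader_score, projected)
print(f"{share:.3f}")                 # 0.171
print(f"{direct - projected:.1f}")    # 0.1
```

On these figures roughly one person in six in the test area would be a reader of any one of the four weeklies, and the two procedures differ by about a tenth of one per cent of the population.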




















AN EXAMINATION OF THE EFFECT OF NUMBER
OF ADVERTISEMENTS IN A MAGAZINE
UPON THE ‘‘VISIBILITY’’ OF
THESE ADVERTISEMENTS


RAYMOND FRANZEN 
Consultant in Market Research 


PROBLEM AND CONDITIONS OF EXPERIMENT 


It is important in any study of communications to know
what people have actually read or listened to. There are
essentially three methods of getting pertinent information:
current records kept of people’s reading and listening
behavior, people’s recollections of what they have done, and data
gathered by some kind of recognition method consisting of
giving people lists of material which has been previously
circulated and asking them whether they have seen or heard it.

Each method has its merits and disadvantages. The recog- 
nition method has been most widely used because it is not tech- 
nically so difficult as the keeping of current records and it is 
not so subject to all the memory fallacies which creep into data 
based on people’s free recollections. For many communica- 
tions studies to come the recognition method will have to be 
used and therefore a more detailed knowledge of its advan- 
tages and possible fallacies implied is important. 

The field where most experience is available is the
application of the recognition method to advertising: innumerable are
the studies in which people are shown magazines or newspapers
and asked whether they have seen a particular item. Because 
it was an early attempt at objective measurement, and because 
its results could be simply stated, the method developed 
rapidly and was generally accepted uncritically. 

Recent questions have been raised as to what the
‘‘visibility’’¹ percentages obtained by any one of the three or four
national services mean and whether they need qualification in
their interpretation and application. These questions include
doubt as to the extent to which any visibility percentage is a
measure of an advertisement’s effectiveness in terms of market
behavior. Some of these questions are concerned with the
interview threshold (responsiveness to a questionnaire) of
different classes of readers. Others ask whether these methods
of obtaining advertisement ratings can yield comparable data
when applied to different magazines. These latter questions
reflect on the use of visibility figures for media evaluation (but
also raise doubts as to the validity of the method itself) and it
is these with which the present experiment is concerned.

1 The per cent of respondents exposed to an advertisement who, by one
criterion or another, are counted as having ‘‘seen’’ the advertisement.

The present paper selects a special problem: What is the
effect of the amount of advertising within a magazine upon the
visibility percentages obtained for that magazine? The
hypothesis is that certain psychological factors, both on the part of
the investigator and the respondent (which we shall call
‘‘fatigue’’ factors), will affect the visibility percentages of the
magazine in relation to the bulk of advertising material to be
covered in the interview.

Two magazines in the same field, which we shall refer to as 
G and M, both with a large volume of advertising, were se- 
lected as subjects of the experiment. Their March 1940 issues 
were used. A standard interviewing procedure for copy-test- 
ing was adopted. Six hundred interviews of two different 
kinds were made for each magazine. One-half were obtained 
from respondents who were taken through the entire magazine, 
looking at each advertisement that was one-half page or larger. 
These whole-magazine interviews are directly comparable to 
those usually obtained.² The other half were obtained from
respondents who were shown only a part of the advertisements 
in each magazine. Magazine G, with 96 advertisements, was 
divided into four sections, and magazine M, with 71 advertise- 
ments, was divided into three sections. These sections in each 
magazine then contained an equal amount of advertising. In 


2 A discussion of this comparability and of the reliability of the
experimental results appears at the close of the article.
these split-issue interviews, one woman was interviewed on one
section only. The results of these sectional interviews are free 
of any effect which is due to the number of advertisements that 
must be covered in any one interview. They were, therefore, 
used as a control against which to measure the results of the 
method as usually applied. 

The field work was divided equally and identically between 
Louisville and Indianapolis, with a specially instructed super- 
visor in each city. 

The mere effect of ‘‘fatigue’’ is evident even without the 
experimental setup to be reported here. The visibility per- 
centages of advertisements examined early in the interviews 
are always higher than those exhibited in the later part of the 
interview. The two-column advertisements have an average 
visibility 9 per cent higher when they are presented first for 
inspection than when they are presented last.³ The average

3 This result was gained in the present experiment in going through the
magazines for half of the cases from the front to the back of the issue,
and from the back to the front for the other half of the cases, so that the
advertisements would half the time be treated early in the interview and
half the time treated late in the interview. There is, however, a numerical
adjustment necessary. The visibility percentages from ‘‘back to front’’
interviews tend to be consistently lower than those obtained when
advertisements were presented in the usual order. This is apparent in the
following averages:

                                   Type of advertisement

                          4-color      B & W        2-color     B & W
                          full page    full page    2-column    2-column
                          %            %            %           %
Magazine G
  ‘‘Front-to-back’’       60           38           33          36
  ‘‘Back-to-front’’       59           32           30          33
Magazine M
  ‘‘Front-to-back’’       68           62           38          35
  ‘‘Back-to-front’’       64           54           35          34

There is, evidently, a slight psychological balking at the unnatural order
of examination, a fact which must suggest the susceptibility of the method
to uncontrollable subjective factors.
visibility of the same black and white full-page
advertisements, in either half of the magazines, is 5 per cent higher
when they are presented in the first half, instead of in the last 
half of the interview. The four-color full pages do not show 
the same effects of ‘‘fatigue.’’ 

Using the two orders of advertisement presentation will, of 
course, help to prevent penalizing an advertisement’s visibil- 
ity because it appears late in the interview. Nevertheless, this 
cannot overcome the effects of more or less ‘‘fatigue’’ asso- 
ciated with magazines requiring a more or less arduous inter- 
view. If ‘‘fatigue’’ occurs toward the end of the interview, 
the longer the interview is the greater the ‘‘fatigue’’ is likely 
to be. This brings us to the primary question: Does this 
method of advertisement evaluation yield comparable results 
for magazines with different advertisement characteristics, 
particularly the commonplace difference of quantity of adver- 
tising ? 


THE EFFECT OF QUANTITY OF MATERIALS ON ‘‘VISIBILITY’’ 
RATINGS 


Both magazines suffer in their rating for advertising
visibility when the whole magazine is used for the interview, as
compared with the split-issue interviews, where psychological
functions of bulk of material are eliminated. The larger
magazine, G, suffers more than the smaller, and the types of
advertisement suffer in direct relation [...] each magazine by
the two methods.

The per cent of visibility loss occasioned by the longer
interview for each advertisement type is summarized in Table 1a
(black and white full-pages are omitted because of the
unreliability of the M magazine figures).

The effect of the ‘‘fatigue’’ factor of the long interview is
clearly apparent. Magazine G, with one and a third times as
much advertising, suffers more than Magazine M. In both
magazines the smaller the advertisement, and the less color
(i.e., advertisements easier to neglect in the long interview)
the more the per cent of visibility is reduced when the long
interview is used. Black and White two-column advertisements
are most neglected in whole magazine interviews, as may be
seen by the large reduction in visibility, compared with less
than ten per cent reduction in the four-color full-page, a type
that is less frequent, more attention-holding, and quickly
identified [...] reasonable when one considers the psychological
hazards of going through a detailed questionnaire routine for as
many as seventy or ninety advertisements.

TABLE 1

Average Visibility for Each Type of Advertisement from Split-Issue
and Whole-Magazine Interviews

                      4-color      B & W        2-color     B & W
                      full page    full page    2-column    2-column
                      %            %            %           %
Magazine G
  Split-issue         66           47           42          50
  Whole magazine      60           35           32          34

Magazine M
  Split-issue         71           64*          45          44
  Whole magazine      66           59*          36          35

* Unreliable—only 5 B & W full-page advertisements in Magazine M.

TABLE 1a

Visibility Loss as Inferred from Table 1

                      4-color      2-color     B & W
                      full page    2-column    2-column
                      %            %           %
Magazine G            9            24          32
Magazine M            7            20          20
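
The losses in Table 1a appear to be the drop from the split-issue average to the whole-magazine average, expressed as a percentage of the split-issue average. The sketch below checks that reading for the two-column advertisements only. It assumes that the figures 42 and 50 (Magazine G) and 45 and 44 (Magazine M) in Table 1 belong to the two-color and black and white two-column columns; the names used are illustrative.

```python
def visibility_loss(split_pct: float, whole_pct: float) -> float:
    """Per cent of split-issue visibility lost in the whole-magazine interview."""
    return 100.0 * (split_pct - whole_pct) / split_pct


# (split-issue average, whole-magazine average) for the two-column types.
table1 = {
    ("G", "2-color 2-column"): (42, 32),
    ("G", "B&W 2-column"): (50, 34),
    ("M", "2-color 2-column"): (45, 36),
    ("M", "B&W 2-column"): (44, 35),
}

losses = {key: round(visibility_loss(*pair)) for key, pair in table1.items()}
for (magazine, ad_type), loss in losses.items():
    print(f"Magazine {magazine}, {ad_type}: {loss} per cent")
```

On this reading the computed losses are 24 and 32 per cent for Magazine G and 20 and 20 per cent for Magazine M, agreeing with Table 1a.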

Chart I shows the per cent of visibility loss experienced by 
each advertisement when its whole-magazine visibility is 
ratioed to its split-issue percentage. There is a definite
‘‘fatigue’’ pattern, particularly in magazine G. The ‘‘scallops’’
of visibility loss throughout the magazine are ‘‘scallops’’ of
diminishing attention and response by the investigator or
respondent, when exposed to the annoyance or tedium of a long
interview. There is small visibility loss by the most conspicu- 
ous advertisements. The four-color full pages punctuate the 
pattern. Interest lags as a long series of small and colorless ad- 
vertisements are presented for questioning. The longer the 
series of the two-column and black and white, the greater is 
their neglect under the trying conditions of long and monoto- 
nous attention. 

The ‘‘fatigue scallops’’ are less obvious in the case of the 
smaller magazine M, but the tendency toward increasing loss 
of reported visibility, when there is a succession of the less 
colorful advertisements, is present. The stimulus given to the 
attention of the interview participants by a color page, or the 
turning of an editorial section is apparent here, too. 

It would seem that the results of the copy-testing interviews 
(the visibility percentages) are subject to the interest or strain 
of the interview and must, therefore, differ for different maga- 
zines. This is shown by the different fatigue patterns of visi- 
bility loss in the two magazines. The same investigators, under 
the same supervisory control, made both the long and the short 
interviews on both magazines, and yet the visibilities obtained 
decrease in exact relation to the arduousness of the interview. 

The effect of the larger quantity of advertising on the visi- 
bility accorded a magazine is interestingly presented by a com- 
parison of the twenty-three advertisements appearing iden- 
tically in both the larger and smaller magazine. On the next 
page are the visibility percentages obtained in the whole-maga- 
zine interviews. 

When measured in this way, seven of these advertisements 
had a reliably higher visibility in magazine M than in G, and 
six advertisements were higher in G. The rest were about 
equal (difference of two points or less). 

                          Visibility in (whole-magazine interviews):
                              Magazine G    Magazine M
                              %             %
4-color full page:      A     74            73
B & W full page:        A     57            57
                        B     59            71
                        C     47            45
2-color 2-column:       A     36            35
                        B     50            47
                        C     46            45
                        D     59            43
                        E     42            46
B & W 2-column:         A     46            46
                        B     39            41
                        C     64            65
                        D     27            30
                        E     31            34
                        F     50            59
                        G     48            53
                        H     36            42
                        I     38            34
                        J     29            21
                        K     28            29
                        L     41            39
                        M     42            37
                        N     41            32

A very different estimate of the two magazines, as
advertising [...] interviews of equal length for both magazines
(split-issue percentages). (Figures on the following page.)

Now, without having its visibility measures handicapped 
because of its size, the larger magazine is found to have higher 
visibility on fifteen advertisements, instead of six. The smaller 
magazine has lost its advantage due to size and instead of 
higher visibility for seven advertisements, it has superiority 
on one only. 
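
The counts quoted for the twenty-three identical advertisements can be retraced from the two lists of visibility percentages. The sketch below is a check under one assumption drawn from the text: an advertisement is counted as higher in one magazine only when the difference exceeds two points, differences of two points or less being treated as about equal. The variable and function names are illustrative.

```python
# (Magazine G, Magazine M) percentages in the order listed: one 4-color full
# page, three B & W full pages, five 2-color 2-column, fourteen B & W 2-column.
WHOLE_MAGAZINE = [
    (74, 73), (57, 57), (59, 71), (47, 45), (36, 35), (50, 47), (46, 45),
    (59, 43), (42, 46), (46, 46), (39, 41), (64, 65), (27, 30), (31, 34),
    (50, 59), (48, 53), (36, 42), (38, 34), (29, 21), (28, 29), (41, 39),
    (42, 37), (41, 32),
]
SPLIT_ISSUE = [
    (84, 73), (62, 64), (70, 70), (62, 54), (48, 49), (60, 58), (56, 48),
    (63, 51), (54, 55), (63, 49), (63, 62), (75, 65), (45, 34), (49, 45),
    (57, 69), (57, 57), (58, 51), (45, 34), (44, 34), (50, 42), (68, 48),
    (66, 48), (55, 47),
]


def tally(pairs, margin=2):
    """Return (G higher, M higher, about equal) for the given margin in points."""
    g_higher = sum(1 for g, m in pairs if g - m > margin)
    m_higher = sum(1 for g, m in pairs if m - g > margin)
    return g_higher, m_higher, len(pairs) - g_higher - m_higher


print(tally(WHOLE_MAGAZINE))  # (6, 7, 10)
print(tally(SPLIT_ISSUE))     # (15, 1, 7)
```

The tallies agree with the text: six against seven in the whole-magazine interviews, fifteen against one when the interviews are of equal length.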


CONSISTENCY AND RELIABILITY OF THE RESULTS WITHIN THEM- 
SELVES AND WHEN COMPARED WITH THE RESULTS OF A 
COPY-TESTING SERVICE 


                          Visibility in (split-issue interviews):
                              Magazine G    Magazine M
                              %             %
4-color full page:      A     84            73
B & W full page:        A     62            64
                        B     70            70
                        C     62            54
2-color 2-column:       A     48            49
                        B     60            58
                        C     56            48
                        D     63            51
                        E     54            55
B & W 2-column:         A     63            49
                        B     63            62
                        C     75            65
                        D     45            34
                        E     49            45
                        F     57            69
                        G     57            57
                        H     58            51
                        I     45            34
                        J     44            34
                        K     50            42
                        L     68            48
                        M     66            48
                        N     55            47

In order to study an accepted procedure, the interviewing
method, criteria and specifications adopted were those of a
national service with widespread acceptance. The visibility
figures this Service reports define an advertisement as seen if
the respondent, upon being shown the original advertisement,
recalls having previously seen it in the given magazine and at
the time of previous notice associated it correctly with the
brand name of the product, or the company’s name. It is,
of course, impossible to prove that the interviewing method
was identical with that used by the Service itself. In fact, our
experience with the method showed such wide and persistent
differences between interviewers that we are convinced that
standard procedure can only be reached through constant
supervision. Every effort was made, however, to follow
instructions as given to interviewers by the Service and to
reproduce interviews of their investigators as actually observed.
The results show a consistency that is surprising in view of
the unobjectivity of the method.
1. The visibility averages for Louisville and Indianapolis
are practically identical:

                              Type of Advertisement

                4-color        B & W          2-color        B & W
                full page      full page      2-column       2-column
                Lou.   Ind.    Lou.   Ind.    Lou.   Ind.    Lou.   Ind.
                %      %       %      %       %      %       %      %
Magazine G      66     66      46     47      43     40      50     51
Magazine M      71     71      66     63      46     45      45     44




2. There is very high correlation between the visibility
obtained for each advertisement by split-issue interviews, by
whole-magazine interviews, and by the Service. The Pearson
r’s for these relationships are:


Magazine G:
  Whole-magazine percentages with split-issue percentages ...... .97
  Whole-magazine percentages with Service percentages .......... .93
Magazine M:
  Whole-magazine percentages with split-issue percentages ...... .91
  Whole-magazine percentages with Service percentages .......... .90


These correlations are particularly interesting when we note 
that the Service results were based on national distribution, 
while the experimental results were confined to two cities. 
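
A minimal sketch of the product-moment computation may make the point concrete. The figures below are hypothetical and are not the study's data; the function name pearson_r is our own. The sketch illustrates that one set of percentages can run consistently below another and still correlate almost perfectly with it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squares around the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical visibility percentages for five advertisements (not study data).
# The second series runs about twelve points below the first throughout.
whole_magazine = [62, 45, 38, 50, 70]
service = [50, 33, 27, 37, 58]

r = pearson_r(whole_magazine, service)
print(round(r, 3))
```

A uniform shortfall in level leaves the coefficient close to unity, which is why a high r says nothing about whether two procedures agree in absolute visibility.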

The experimental whole-magazine percentages are consis- 
tently higher than those of the Service, as is apparent in the 
average visibilities by advertisement type (Table 2). 

TABLE 2

Average Visibility for Each Type of Advertisement from Whole Magazine
Interviews and as Reported by the Service

                      4-color       B&W        2-color       B&W
                     full page    full page    2-column    2-column

Magazine G
  Whole magazine        60           35           32          34
  Service               48           28           17          21

Magazine M
  Whole magazine        66           59*          36          35
  Service               55           46*          21          24

* Unreliable: only 5 B & W full page advertisements in magazine M


It may be argued from this average difference that there was
a difference in the criterion used to identify a seen
advertisement. It seems much more likely, however, that this
difference reflects the difference between field work under
constant resident supervision, and that carried on by isolated
investigators, where daily supervision is impossible and personal
supervision is but periodic. A real difference in criterion, as
contrasted with differing rigor of application, would not yield
such high correlation between the two sets of visibility figures.

The whole magazine and the Service visibilities are alike
except in the degree to which they fall below the split-issue
visibilities.


not only subject to the arduousness of the interview, but also 
to the amount of effort expended by the investigators, as shown 
by the highly correlated but lower percentages obtained by 
isolated investigators under less stringent supervision. 





THE USE OF MAIL QUESTIONNAIRES TO
ASCERTAIN THE RELATIVE POPULARITY
OF NETWORK STATIONS IN
FAMILY LISTENING SURVEYS


PAUL F. LAZARSFELD 


Columbia University 


If one wants to study the popularity of stations he will first
have to agree on a definition of popularity. The most
obvious definition would be in terms of the average amount
of time the people of a certain area listen to a station, say
over a period of a week. The station to which more listening
time is devoted would then be more popular.

Speaking with people in the radio industry, one will discover
that they do not always agree with this approach. The
argument against a definition of popularity in terms of time runs
about this way. The amount of time people spend listening to
a certain station is likely to be determined by four factors: the
physical signal strength of the station, its present program
structure relative to competing programs, its past program
performance relative to competing programs, and the specific
‘‘color’’ of the station, which might consist of especially
popular announcers or a reputation of reliability, or particular
speed of reporting, etc. Some people feel that it is this fourth
factor which should be meant when station popularity is
discussed, or a combination of this fourth factor and the residues
of past program policy which together might be called the
‘‘station halo.’’

Although for the direction of future research it will be
decisive to agree on an adequate definition of station
popularity, the question is not of great importance for one who is
dealing with data available so far. The extensive surveys made
during the last ten years have not covered all those problems
but have restricted themselves to asking people any one of three
questions: Which stations they listen to at all; which stations
they listen to regularly; and which stations they listen to
most.

In this way not only do we not know what we mean by 
station popularity, but we also do not even know what our 
returns mean in terms of actual behavior of people. Two 
persons, both of whom claim to listen to a station regularly, 
might mean rather different things in terms of listening time 
or attitude toward the station. 

To these two problems we have to add a third as soon as we 
start actually to collect information. The majority of the large 
surveys done by the networks are based on mailed question- 
naires, and doubt can be raised as to such procedures. As a 
matter of interest, some five or six years ago there was organ- 
ized a Joint Committee on Radio Research which, in turn, ap- 
pointed a technical committee to investigate the best procedures 
for ascertaining ‘‘station listening areas.’’ The final report 
of this technical committee was based on elaborate experiments 
with the following ‘‘basic question’’: 


Please name the radio stations you and your family 
regularly use for actual listening to programs. 


Most of the studies made since that time included this basic 
question, preserving the wording, ‘‘regular listening’’ and the 
stress on family rather than on individual listening. They 
deviated, however, from the advice of the technical committee 
inasmuch as some studies used mailed questionnaires rather 
than personal interviews which were strongly suggested in the 
committee’s report. 

From the large range of problems which could be derived
from our introductory remarks, the present paper selects just
one: the comparative merits of personal interviews and mailed
questionnaires, in surveys where the ‘‘basic question’’ is being
used. This problem has two sides:

1. One aspect of the question might be called the
problem of individual validity. It depends upon the
psychological nature of a question whether we would
get more truthful or reliable answers if we asked it in a
personal interview rather than in a written questionnaire.
For instance, it is likely that a person would
rather tell the truth about his yearly income in a written
report than in personal contact with an interviewer.
There is probably a similar effect in the case of all kinds
of embarrassing questions. On the other hand, if we
are interested in all the details of a specific experience,
the personal interview will be more successful than the
written inquiry. Whether asking a person for the station
‘‘which you and your family regularly use’’ would
be better in face-to-face contact or in writing is what is
meant in this paper by the term, ‘‘individual validity.’’

2. Clearly distinguished from this problem is the
question of sampling and the bias which might be
induced by using mailed questionnaires. We shall talk of
statistical validity when referring to the sampling
problem.


It is important to realize that these two questions are not 
necessarily related. For example, it might be that we can get 
better information about a single family by corresponding 
with them rather than talking with them; and yet the people 
who answer our questionnaires might not be representative of 
the whole group we are interested in and therefore mailed 
questionnaire returns might be statistically invalid even if the 
mailed inquiry has a greater individual validity. On the other 
hand, it might be that we do not have a sampling error but the 
mailed returns are individually invalid: for instance, if the 
postcards are filled in by a lot of children who make a joke of 
returning them. Therefore, we must discuss individual and 
statistical validity separately. 


THE INDIVIDUAL VALIDITY OF MAILED REPORTS ON 
FAMILY LISTENING 


The notion of family listening might sound somewhat
awkward, as if one were attributing to the family a personal
entity. As a matter of fact, as radio research overcomes its
crude beginning, it is very likely that more and more stress
will be laid upon individual listening. Still, family listening
can be put into sensible, operational terms: it is the amount
of time during which the family radio is tuned to different
stations. We can only speculate as to whether a personal
interview or a mailed questionnaire is psychologically preferable
for getting valid information on the listening habits of a
family as a whole. The two methods seem to be different in
at least three respects.


        Mailed Questionnaire                  Personal Interview

(a) The mailed questionnaire is         (a) The interviewer is more likely
    relatively more likely to be            to be received by the wife.
    answered by the man in the
    family.

Data collected from several surveys seem to indicate that about 80 per
cent of door-to-door interviews but only 50 per cent of mailed
questionnaires are answered by women.3

(b) The mailed questionnaire if it      (b) The personal interview is
    refers to family listening is           obviously answered by one person
    likely to be answered in                for the whole family.
    consultation with several
    family members.


In the follow-up study mentioned in Footnote 3 the respondent was asked 
how the person who had filled in the card determined what the rest of the 
family listened to. The procedures followed by the original respondent 
could be classified in the following way: 


2 It would make a tempting object of speculation that here we have a
concept the meaning of which is altered by technological changes; the
more multiple ownership of radios spreads, the less useful will the concept
of family listening become.

3 The latter figure was arrived at by following up by personal interviews
more than 600 returned postcards. Incidentally, only 5 per cent of this
special group of postcards were filled in by children below 15 years of age.
Unfortunately, the postcards followed up were taken from a larger number
which contained also many unsigned cards; this possible source of bias does
not permit a numerical generalization, a shortcoming which should also be
kept in mind in reading the following Table 1.


TABLE 1
How People Who Filled in a Postcard Ascertained the Regular Listening
Habits of Other Family Members

                                                    Per cent of cases
How postcard was filled in                          where family listening
                                                    was included
Showed card, implied asking ........................        13
Didn’t show card, but asked ........................         1
Can’t tell if showed card, but asked ...............        19
Showed card, did not ask ...........................         1
Consulted some but not all of family ...............        12
Consulted about day but not night listening ........         2
Did not consult but assumed he knew ................        50
Unknown ............................................         2
Total per cent .....................................       100
Total cases ........................................       456








(c) The mailed questionnaire is         (c) The personal interview is
    presumably answered after               answered on the spur of the
    some deliberation.                      moment.

What do these differences mean for the validity of the an- 
swers? Difference (a) points to the question of whether men 
or women know more about the listening habits of the family. 
One might argue that the housewife who is at home knows 
more, but on the other hand, the man might be better trained 
to make the estimate needed. As a result, for instance, the 
wife as an informant might be preferable for daytime listen- 
ing, and the husband for evening hours. 

Difference (b) is probably an advantage of the mailed
questionnaire because as long as family listening is under
investigation, the family in consultation will know more about it
than a single member will.

Difference (c) again is a matter of psychological conjecture.
If the respondent has time to think a matter through, his
answer should be a more valid one provided he does not sit down
and just copy from the newspaper stations to which no one in
the family listens. The answer given to the personal
interviewer on the spur of the moment might be greatly influenced
by the experience of the last few hours and therefore might
not reflect an over-all situation, not even for the individual
respondent. (These variations, however, might cancel out
over a great number of interviews taken at different times of
the day.)

It is important to see clearly the kind of test of validity re- 
quired to decide the question just discussed. A respondent’s 
reply is valid if his statement on family listening reflects satis- 
factorily what all the family members together would have 
said if each had been asked separately what stations he listens 
to regularly. An example of such a test is on hand. 

In the spring of 1940 the National Broadcasting Company
conducted an extensive listener survey, sending out more
than a million postcards, and about 150,000 families reported
their listening habits. In the course of this survey, the Office
of Radio Research made a special follow-up study;4 669 signed
postcards were picked out at random and every member of each
of these families was asked the question used on the NBC
postcard: ‘‘What radio stations do you listen to regularly in the
evening? (In the daytime?)’’ To obtain an over-all picture
of the family, we tabulated once, and once only, each station
that was mentioned. That is, if several members of a family
mentioned the same station, it was tabulated just as if one
person alone had mentioned it. Only in this way could we
compare the personal interviews with the mailed returns, where
people had no chance to give special weight to stations
listened to by more than one family member. At first glance
the data might seem rather discouraging.
Only 44 per cent of the urban and 15 per cent of the rural cases
correspond completely when recorded according to the two
methods. For all the other families, the postcard gives either
more or fewer or somewhat different stations compared with the
cumulative report of all the family members. But these
differences seem due to minor memory variation, for if we
compare the total number of mentions of each network, we find a
striking similarity. In most of the twelve comparisons offered
in Table 2, the frequency with which a group of stations is
mentioned in the personal interview does not differ by more than
one per cent from the frequency with which it is mentioned on the
postcard.

4 The study was under the direction of Marjorie Fleiss.
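
The tabulation rule just described (each station counted once per family, however many members name it) can be sketched as follows. This is only an illustration under our own assumptions: the call letters and reports are hypothetical, and the function names are ours, not those of the original tabulation.

```python
def family_stations(member_reports):
    """Cumulative family report: every station named by any member, counted once."""
    stations = set()
    for report in member_reports:
        stations.update(report)
    return stations


def share_of_exact_matches(families):
    """Fraction of families whose postcard lists exactly the cumulative set."""
    matches = sum(
        1
        for postcard, member_reports in families
        if set(postcard) == family_stations(member_reports)
    )
    return matches / len(families)


# Hypothetical data: (stations on postcard, one list of stations per member).
families = [
    (["WAAA", "WBBB"], [["WAAA"], ["WAAA", "WBBB"]]),         # complete agreement
    (["WAAA"], [["WAAA"], ["WCCC"]]),                          # postcard omits WCCC
    (["WAAA", "WBBB", "WCCC"], [["WAAA", "WBBB"], ["WBBB"]]),  # postcard adds WCCC
    (["WBBB"], [["WBBB"], ["WBBB"]]),                          # complete agreement
]

print(share_of_exact_matches(families))
```

Complete correspondence is a severe criterion: a single omitted or added station makes the whole family count as a disagreement, which is why the aggregate comparison by network is the more informative one.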


TABLE 2
Stations Mentioned on Postcard for Regular Listening Compared with
Those Mentioned by Personal Interview

                      Per cent distribution of stations mentioned

                             Urban                        Rural
Stations                Day          Night           Day          Night
mentioned           Post-  Inter- Post-  Inter-  Post-  Inter- Post-  Inter-
                    card   view   card   view    card   view   card   view

Red Network .......   28     29     32     32      32     33     36     37
Blue Network ......   13     12     16     15     [?]    [?]     15     16
CBS ...............   31     32     30     30      31     29     33     31
Other Stations ....   28     27     22     23      20     21     15     16
Total per cent ....  100    100    100    100     100    100    100    100

Base: total number
of stations
mentioned .........  795    833   1200   1257    1064    948   1166   1002
Average number
of stations
mentioned by
each respondent ...  2.1    2.2    3.2    3.3     3.6    3.2    3.9    3.4
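
The closeness of the two distributions in Table 2 can be checked mechanically. The sketch below uses the urban daytime, urban evening and rural evening columns; the rural daytime column is left out because two of its entries are not legible, and the third row is read here as CBS, which is our assumption. The variable names are our own.

```python
# Percentages from Table 2 as (postcard, interview) pairs.
# The "CBS" row label is an assumed reading; rural daytime is omitted.
table2 = {
    "urban day": {"Red": (28, 29), "Blue": (13, 12), "CBS": (31, 32), "Other": (28, 27)},
    "urban night": {"Red": (32, 32), "Blue": (16, 15), "CBS": (30, 30), "Other": (22, 23)},
    "rural night": {"Red": (36, 37), "Blue": (15, 16), "CBS": (33, 31), "Other": (15, 16)},
}

# Absolute difference, in percentage points, for every column and station group.
diffs = {
    (column, group): abs(postcard - interview)
    for column, rows in table2.items()
    for group, (postcard, interview) in rows.items()
}

largest = max(diffs.values())
within_one = sum(1 for d in diffs.values() if d <= 1)

print(len(diffs), within_one, largest)
```

Among these entries only one differs by as much as two points between the two methods.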





The similarity of the two sets of results is impressive when 
it is considered that the cards were filled in by one family 
member whereas the tabulation of the stations mentioned in 
the personal interview covers every member of the family. 
There was a period of two to four weeks between the filling in 
of the questionnaire and the personal interviews. On the other 
hand, the classification by networks is rather crude and the 
similarity of results obtained by the two methods might not 
be so close if, instead of networks, individual stations were 
listed. There were, however, too few cases in each single 
listening area to permit such breakdowns.

As will be remembered from Table 1, about half the people
who filled in the postcards consulted with their other family
members, and half replied without consultation. Two tables
similar to Table 2 have been computed for each of these two
groups of families separately. The results were the same,
indicating that with or without consultation people are well
able to give a correct average picture of the stations to which
their families listen regularly. This would lead us to expect
that if personal interviews pertaining to family members were
checked against the actual reports of the rest of the family,
they would also lead to a satisfactory average individual
validity. The present figures, however, vouch only for written
reports.5


THE STATISTICAL VALIDITY OF MAILED REPORTS ON 
FAMILY LISTENING 


The problem of statistical validity enters when we wonder
what the self-selection of the respondents does to our final
results. In a run-of-the-mill mail survey only 10 to 20 per cent
of the addressees answer and if families with different listening
habits furnished varying degrees of returns, the mail surveys
might be seriously biased.

There are at least two factors which definitely influence mail
returns; they are literacy of the respondent and the interest
he takes in the topic under discussion. If, for example, people
are divided into those with completed secondary schooling or
more, and those who had not been graduated from high school,
the former are more likely to answer a questionnaire. As it is
known that attitude toward serious music is highly associated
with formal education, a tabulation of mailed questionnaires
would always overrate the frequency of lovers of serious music
because those who like it are more likely to return the
questionnaire, due to their higher literacy. Equally well-known
is the self-selection of interested people. In sending out a
questionnaire, for instance, to the parents of school children
asking whether they listen to a child-guidance program, we
would overrate the number of listeners because those who do
listen are more likely to answer than those who do not.6

5 It is worthwhile to point out how a test of validity would differ if the
basic question pertained to individual rather than to family listening.
Then we would have to compare what a respondent says he is doing with
what he actually does. In this case what we want to measure with our
basic question would be decisive. Suppose we aim at amount of time
devoted to a station; a test of individual validity would consist of
comparing the proportion of people claiming regular listening to different
stations with average amount of time actually devoted to these stations.
The actual listening time would have to be measured either by a mechanical
device attached to the radio (Audimeter) or by interviews made every few
hours (to minimize memory losses). Such a comparison of ‘‘average
amount of listening time’’ and ‘‘proportion of people listening regularly’’
requires a number of interesting statistical assumptions which to discuss
would go beyond the scope of this paper.

It is important to realize, however, that the very fact of
self-selection in mail returns does not yet determine whether a
mail survey can be made. If the information we are looking
for is not related to the factors which determine
responsiveness, then the self-selection in returns does not
discredit our results. Therefore the way to determine whether
there is a self-selective bias in the information on station
listening habits derived from mailed questionnaires is to study
first whether or not the factors related to responsiveness are
also related to listening habits.
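
The argument can be illustrated with a small piece of arithmetic. The sketch below is only an illustration under assumed figures: the two groups, their rates of return and their station preferences are hypothetical, and the function names are ours.

```python
def share_among_returns(groups):
    """Share preferring the station among those who return the questionnaire.

    groups: list of (population, rate_of_return, share_preferring_station).
    """
    returns = sum(pop * rate for pop, rate, _ in groups)
    preferring = sum(pop * rate * share for pop, rate, share in groups)
    return preferring / returns


def share_in_population(groups):
    """Share preferring the station among all addressees."""
    total = sum(pop for pop, _, _ in groups)
    preferring = sum(pop * share for pop, _, share in groups)
    return preferring / total


# Hypothetical: the more literate group returns 20 per cent, the other 10 per cent.
# Case 1: preference is unrelated to literacy (30 per cent in both groups).
unrelated = [(1000, 0.20, 0.30), (1000, 0.10, 0.30)]
# Case 2: preference is related to literacy (40 against 20 per cent).
related = [(1000, 0.20, 0.40), (1000, 0.10, 0.20)]

for label, groups in (("unrelated", unrelated), ("related", related)):
    print(label, round(share_among_returns(groups), 3), round(share_in_population(groups), 3))
```

In the first case the returns reproduce the population figure in spite of unequal rates of return; in the second the station favored by the more responsive group is overrated.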

The most important factor to be studied is literacy which,
for our purposes, can be approximated by economic level: mail
returns would overrate the station preferred by the higher
cultural levels. Therefore they could be used only if it were
established that by and large no relation existed between
economic status and stations listened to regularly. Offhand,
special situations come to mind, where the use of mailed
questionnaires would be unjustified on this very ground. For
example, a mailed questionnaire to ascertain listening to
WQXR, a New York station which specializes in serious music,
would certainly overrate the audience because more literate
people are more likely to like WQXR. On the other hand,
preference for small non-network stations would be underrated
because it is known that lower income groups who are less likely
to return a questionnaire are more likely to listen to small
stations.7 Therefore the following discussion is restricted to
stations affiliated with three of the four major networks.8

6 For evidence on the bias obtained in questionnaire returns, see Suchman
and McCandless, ‘‘Who Answers Questionnaires,’’ in the present issue of
this magazine, pp. 758. There it is shown that literacy and interest
operate independently to increase questionnaire returns.

It is likely that the program policies of the major networks 
are about the same as far as their appeal to different cultural 
levels goes. But many affiliated stations have considerable time 
left for local programs and it could be that in this way they 
build up followings with different propensities to answer ques- 
tionnaires. 

What actually is true can itself be made the object of
empirical research. It would be possible and advisable to study the
program schedules of several network stations for their
‘‘maturity level.’’ Appropriate rating scales have been developed
by students in the reading field.9 This whole reasoning,
incidentally, pertains to station preference only and therewith to
the complete program schedule of a station. Single programs
show great differences in the social stratification of their
audience and therefore mail surveys on them would be invalid.10

No studies are available on the popularity of individual sta- 
tions based on enough cases to permit a decision as to economic 
differences. As to networks as a whole, there is some indica- 
tion that their appeal is the same in all income groups. 

7 See Fiske and Meyrowitz, ‘‘The relative preference of low income
groups for small stations,’’ Jnl. of Appl. Psychol., XXIII, 1, February,
1939.

8 The Mutual Broadcasting System has to be excluded because its
stations have more time to develop a local character of their own.

9 Leahy, Helen H., and Morgan, Winona, ‘‘Cultural Content of General
Interest Magazines,’’ Journal of Educational Psychology, October, 1934.

10 A survey of social stratification of the radio audience to specific
programs by H. M. Beville can be procured at the Office of Radio Research at
Columbia University.

The follow-up study of 669 respondents to a mailed
questionnaire, mentioned above, gave an opportunity to divide the
respondents in the customary way according to their
socio-economic status into four groups, A (highest) to D (lowest).11
For each group the stations listened to regularly were
combined according to their network affiliations. The results are
reported in Table 3.


TABLE 3

Blue Network and CBS Mentions per 100 Red Network
Mentions by Class of Home

                   For Evening Listening         For Daytime Listening
Network            A    B    C    D  Total*      A    B    C    D  Total*

Red ............. 100  100  100  100   100      100  100  100  100   100
Blue ............  47   40   47   51    46       48   43   51   52    49
CBS ............. 115   88   95   85   100      100  107  105   97   103
Number of
respondents .....  28  195  339   99   669       28  195  338   99   669

* Total includes ‘‘No Answer.’’


The number of mentions going to Red Network stations is
taken as 100, and the Blue Network and CBS mentions are
computed per 100 Red mentions. This way it is easier to see
whether a consistent economic trend exists. As far as the size
of this sample goes, none of the differences is statistically
significant.12
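
The index used in Table 3 is a simple ratio, sketched below. The mention counts are hypothetical (the paper reports only the resulting index numbers), and the function name per_100_red is our own.

```python
def per_100_red(mentions):
    """Express each network's mentions per 100 Red Network mentions."""
    red = mentions["Red"]
    if red == 0:
        raise ValueError("no Red Network mentions to serve as the base")
    return {network: round(100 * count / red) for network, count in mentions.items()}


# Hypothetical mention counts for one socio-economic group (not study data).
group_c = {"Red": 400, "Blue": 188, "CBS": 380}

print(per_100_red(group_c))
```

Putting every group on the same base of 100 removes the effect of group size, so that the rows can be read across the classes A to D for a trend.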

11 For a discussion of these ‘‘intuitive ratings’’ see P. F. Lazarsfeld,
‘‘Interchangeability of Indices in the Measurement of Economic
Influences,’’ Jnl. of Appl. Psychol., XXIII, 1, February, 1939.

12 There is also no economic difference if the rural and urban interviews
are tabulated separately.

So far we have discussed the role of literacy in the
self-selection of mail questionnaire returns. As to the factor of
interest, the situation seems to be the following one. People
who listen much and to many stations are more likely to return
a questionnaire on listening habits.13 Therefore the frequency
of listening to every station will be overrated in a mail survey.
It is not easy to conceive a reason, however, why this factor
should favor one station more than another. As far as interest
goes, then, we are inclined to feel that mail surveys attribute to
the different network stations an adequate relative position, but
overstate the actual ‘‘coverage’’ of each of them.

13 Evidence on this point comes from a study made by Elmo Roper
for the National Broadcasting Company.

Whether there are additional factors which could be
dangerous for the use of mail questionnaires has to be left to further
theoretical analysis and subsequent field work. The
procedure will always be to make personal interviews first and to
see whether we find variations in regular family listening
according to given preconceived factors.14 As long as such
differences do not show up, mail surveys can be justified.

Again the question arises whether the two methods are
likely to yield very different answers. The author had access
to 15 studies where different agencies had made mail surveys
in a county, and at the same time had made personal
interviews. The number of postcards returned varied from 70 to
270; the number of personal interviews from 70 to 880. For
each study the rank correlation was computed between the
frequency with which each individual station was mentioned
on the postcards and in the personal interviews; every station
mentioned by more than 10 respondents was included in the
computation. Table 4 gives the correlation coefficient (Spearman
rho) between the two rank orders.
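
For a single county the computation might be sketched as follows. The mention counts are hypothetical, the function names are ours, and the stations are assumed to have been screened beforehand for the minimum number of mentions. Tied frequencies are given their average rank, which is one common convention and not necessarily the one followed in the 15 studies.

```python
def average_ranks(values):
    """Rank from largest (rank 1) downward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run over all values tied with the one at position pos.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rho: product-moment correlation of the two sets of ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mentions of five stations in one county (not study data).
postcard_mentions = [120, 95, 60, 40, 15]
interview_mentions = [210, 150, 170, 80, 30]

print(round(spearman_rho(postcard_mentions, interview_mentions), 2))
```

Because only the order of the stations enters the coefficient, the unequal numbers of postcards and interviews in a county do not affect it.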

TABLE 4
Rank Order Correlation between the Frequency with which Stations were
Reported as Listened to Regularly in Personal Interviews
and Simultaneous Mail Surveys

Correlation          Number of Counties
.91-1.00                     4
.81- .90                     3
.71- .80                     6
.70 or less                  2


In seven cases the agreement between the two methods is
practically perfect; in six cases it is acceptable. Even the
two lowest correlation coefficients do not go below .50. We
can therefore say that even if we consider specific stations, the
two methods do not give very different results. If more such
comparisons are done it should be especially worthwhile to get
general information on the ‘‘problem counties.’’15

14 It should be kept in mind that under special conditions even personal
interviews might yield a sampling error. Suppose we want to know the
average amount of time the radio is tuned in, and every time we do not find
a family selected for the interview at home, we then interview the neighbor
to the right. In this case we are likely to overrate the amount of listening
on an average, because those families who are not at home are likely to have
a smaller amount of total listening.

It is important to see the difference between the results just
reported and the one implied in Table 2. In Table 2 each
family provides two items of information inasmuch as we
compare their mailed return with their own personal interviews.
Here the mailed returns come from other people than those
with whom the personal interviews have been made;
comparable information was sought by interviewing in urban
blocks or on rural roads similar to those to which the family
survey had been addressed.16 In other words, Table 2 dealt
with individual validity of mailed questionnaires and shows
that they reflected correctly the cumulative information
obtained from all family members; there still could be a sampling
bias because the families interviewed were those who answered
the postcard. Table 4 tests jointly individual and statistical
validity. The discrepancies in the two problem counties could
be due either to the fact that the information given by the
informants was influenced by the procedure of approach or to
the fact that the mailed returns come from families which are
not representative of the whole county, as presumably the
personally contacted families were.

15 For instance, it might well be that certain stations which specialize
in women’s programs are overrated in the personal interview where the
women are the main informants, whereas other stations which might
specialize in sport reviews would be overrated by postcard returns which are
more likely to be filled in by men. Such comparisons would also bring to our
attention stations especially preferred by upper or lower income groups
which, in postcard returns, would be favored or penalized, respectively.

16 Also the classification in Table 2 pertains to groups of stations
combined in networks, whereas here each station is ranked individually.


SUMMARY 


Most of the surveys devised to measure station popularity
include the question, ‘‘Which station do you and your family
listen to regularly?’’ Either the notion of ‘‘regular listening’’
can be considered as an attitude and answers treated like the
answers to any other attitude test; or an effort can be made to
translate the answers more precisely into actual behavior, as for
example the amount of time spent in listening to a station.
So far all known surveys have used the question as an
attitude test.

The present paper compares the advantages and disadvan- 
tages of asking this question in direct personal interviews or in 
mailed questionnaires. 

The problem divides itself into the aspect of ‘‘individual 
validity’’ and ‘‘statistical validity.’’ By individual validity is 
meant that in each single case the respondent reported correctly 
the cumulative attitude of the whole family. As far as the 
rather crude notion of family listening goes, the mailed reports 
seem to have certain advantages. A test of more than 600 cases 
showed great agreement between written reports turned in by 
one family member and the cumulative answers of all family 
members interviewed directly. 

By statistical validity is meant the self-selection of respon- 
dents or the bias introduced by the fact that only a small part 
of the addressees return a mailed questionnaire. Here the 
mailed questionnaire can be at best only as good, and no 
better, than a well-sampled group of personal interviews. 
There can be no doubt that people who return questionnaires 
are more literate and more interested in the topic under inves- 
tigation. The question whether popularity surveys by mail are 
valid, therefore, boils down to the question whether station 
preference is related to literacy, to interest in radio matters, or 
to other factors influencing questionnaire returns, but not yet 
discovered. As far as the comparison of networks goes, avail- 
able evidence does not indicate that there is a major danger 
of bias. But studies on the ‘‘maturity level’’ of program 
schedules are strongly suggested. 

If the proportion of people stating that they and their 
families listen regularly to a station is taken as an index of 
the relative role this station plays in family listening, then 
mail questionnaires, as far as network stations go, seem to be 
a warranted way of collecting this information. 

















V. Measurement Problems 


THE QUANTIFICATION OF CASE STUDIES 


PAUL F. LAZARSFELD AND W. S. ROBINSON 
Columbia University 


PSYCHOLOGY OF CLASSIFYING CASES 


THE use of case studies in psychological and social research 
is beset by extreme subjectivity which carries 
with it two important disadvantages: non-comparability 
and lack of precision. One’s criteria of classification tend to 
shift from case to case according to the information which hap- 
pens to appear in the individual case study. Single judgments 
intuitively made on individual cases, moreover, are subject to 
a high degree of error because a single mistake wrongly classi- 
fies the entire case study. 

Consideration of the psychology of classifying case studies, 
however, suggests a method for avoiding these difficulties. 
For example, consider the psychology of classifying case 
studies with respect to their positions on a linear continuum. 
The customary procedure consists in determining whether the 
case as a whole lies to the right or the left of the zero point— 
or some other arbitrarily selected point—on this continuum. 
Psychologically, however, the case study is generally not 
judged as a whole. On the contrary, various bits of relevant 
information are weighed, and a summary judgment based on 
these indications is finally arrived at. In other words, the case 
study is subjectively and implicitly treated as a kind of test 
in which the number and nature of the questions differ from 
case to case. 

The observer reads through the description of the individual 
and looks for specific indications on the point at issue. He 
may find, and often does find, contradictory indications; but 
in any event his final classification of the case depends upon 
the direction in which the bulk of the evidence falls. 


A SUGGESTED PLAN OF PROCEDURE 


The psychological procedure of classification consists essen- 
tially in making a pseudo-test of each case study. To do this 
one must: 

A. Define the continuum in terms of which the case studies 
are to be classified. This requires an explicit statement as to 
the general quality or characteristic according to which the 
individuals will be classified. 

B. Decide which indicators (i.e., ‘‘bits of information’’) in 
each case study are items on this continuum, i.e., are relevant 
to the classification. If the continuum is that of dominance- 
submission, any specific detail helpful in deciding whether an 
individual is dominant or submissive is an indicator on the 
continuum for that particular individual. Deciding which 
indicators are relevant to the classification thus consists merely 
in specifying or concretizing the definition of the continuum in 
terms of specific items of information. 

C. Give a numerical value, with sign, to each indicator, ac- 
cording to its position on the continuum. To continue the pre- 
ceding illustration, if a particular detail indicates that the in- 
dividual is submissive, it might be designated by —1, to show 
that so far as this particular detail is concerned the individual 
is on the negative side of the dominance-submission continuum. 
If the specific detail indicates that the individual is dominant, 
this fact might be indicated by +1. A detail relevant to the 
classification but indicative that the individual is neither very 
dominant nor very submissive might be given a value of zero. 
A more detailed rating scheme might also be used if desired. 

D. Combine the ‘‘scores’’ on the different indicators for 
each case study in order to determine a final index of the 
position of the case on the continuum. The most efficient 
combination of the indicator scores for a single case is their 
arithmetic mean. If different numbers of indicators are found in 
different case studies, these means will of course be based on 
different numbers of observations. Thus if in a particular 
case study three positive, one zero, and four negative indicators 
are recorded, the final index will be 

$$\frac{(+3) + (0) + (-4)}{8} = -\frac{1}{8} = -.125.$$

If for another case study there are four positive, two zero, and 
six negative indicators, the final index will be 

$$\frac{(+4) + (0) + (-6)}{12} = -\frac{2}{12} = -.167.$$

Computing indexes in this way allows the utilization of all 
relevant information found in the case study; none need be 
thrown away. The total range of the scores will of course lie 
between -1 and +1. 
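
The computation in step D can be sketched in a few lines of code. This is only an illustration of the arithmetic mean described above; the function name `final_index` and the list representation of a case study are assumptions made here, not part of the original procedure.

```python
from fractions import Fraction


def final_index(indicator_scores):
    """Arithmetic mean of the signed indicator scores (+1, 0, -1) of one case study."""
    if not indicator_scores:
        raise ValueError("a case study needs at least one scored indicator")
    return Fraction(sum(indicator_scores), len(indicator_scores))


# The two worked cases from the text: 3 positive, 1 zero, 4 negative indicators,
# and 4 positive, 2 zero, 6 negative indicators.
case_a = [+1] * 3 + [0] * 1 + [-1] * 4
case_b = [+1] * 4 + [0] * 2 + [-1] * 6

print(round(float(final_index(case_a)), 3))  # -0.125
print(round(float(final_index(case_b)), 3))  # -0.167
```

Because every indicator score lies between -1 and +1, the mean necessarily does too, whatever the number of indicators found in a case.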

In combining the scores it is neither necessary nor desirable 
that the number of indicators be the same for the different 
cases. It may even be that none of the indicators used in 
classifying one case will be involved in classifying another. 
The suggested procedure in fact merely formalizes a process 
common in daily life. The selection of an employee, for ex- 
ample, is often the result of such a procedure. One observes 
that the applicant speaks positively and seems to know his 
business but also that he avoids making decisions, and one esti- 
mates his desirability in the light of these conflicting indica- 
tions. The selection of another applicant, however, may in- 
volve very different indicators; for example, the way in which 
a man expresses himself in a letter and the recommendations 
he encloses. 

The indicators may of course be weighted in any desired 
manner if this is felt to be necessary. It has been shown, 
however,¹ that as the number of items in a composite score 
increases, the correlations between scores based on different linear 
combinations of the items rapidly approach +1, so that when 
the number of indicators is greater than six or seven, it is 
doubtful whether weighting is of much value. 

1 S. S. Wilks, ‘‘Weighting Systems for Linear Functions of Correlated 
Variables When There Is No Dependent Variable,’’ Psychometrika, Vol. 
3, No. 1, March, 1938, pp. 23-40. 
ADVANTAGES 


Three important advantages are gained by using the method 
described : 

(1) It is both psychologically and logically sounder than the 
intuitive method. It objectifies the classification by bringing 
the specific steps and procedures into the open, and by demand- 
ing an explicit statement of the basis of classification instead 
of an intuitive snap judgment of the case as a whole. Strict 
comparability between cases is thus maintained. It is also 
perfectly sound logically, since the linear combination of indi- 
cator scores which involve a common factor is a well recognized 
principle of measurement. 

(2) The step which it takes in the direction of quantification 
is very important in practical work. The final result is not 
the customary division of cases into two classes, but rather a 
set of scores graded along a continuum decided upon in ad- 
vance. By introducing the important notion of degree of pos- 
session of an attribute, the present method makes possible the 
use of powerful methods of statistical analysis. 

(3) The sampling variance² of the classification is probably 
reduced by a considerable amount. Suppose for the sake of 
argument that the variance of classifying the entire case study 
intuitively is equal to the variance of classifying a single 
indicator. On this assumption, the variance of the final mean 

2 Repeated measurements made even upon the same physical constant 
are subject to accidental errors inherent in the measuring process. The 
classification of case studies is subject to errors of the same kind. If a 
selected case study were repeatedly classified by the same observer at 
different times, the final scores or indexes for that particular case would 
thus not have identical values, but would cluster around the true final index 
in a frequency distribution of errors. The standard deviation of this dis- 
tribution is called the standard error of the final index, and can be esti- 
mated without repeatedly classifying the same case study. The square 
of the standard error, a more convenient measure to use than the standard 
error itself, is called the ‘‘variance’’ of the final index. Obviously, the 
smaller the variance of an index, the more precise that index is. It is 
highly desirable to use indexes with minimum variance, since errors of 
sampling are reduced to a minimum by so doing. 
index will then be 1/nth of the variance of the intuitive 
classification, where n is the number of indicators found in the 
particular case study. This may be expressed more picturesquely 
by saying that a single mistake in classifying a case study as 
a whole wrongly classifies the entire case, while a mistake in 
placing one indicator out of several will not have nearly so 
great an influence on the final score. 
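
The 1/n reduction can be written out in one line. The derivation below adds an assumption not stated explicitly in the text, namely that the errors made on the n indicators are independent of one another and share a common variance; the symbols x_i (score on the i-th indicator) and σ² (that common variance) are introduced here for illustration only.

```latex
\operatorname{Var}(\bar{x})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{n\,\sigma^{2}}{n^{2}}
  = \frac{\sigma^{2}}{n}
```

With eight indicators, for instance, the variance of the final index would on these assumptions be one-eighth of the variance of a single intuitive judgment.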

In actual practice, however, the advantage will probably 
not be so great as the preceding reasoning indicates, because 
in intuitively classifying a case study the ordinary research 
worker probably takes advantage implicitly of the existing 
specific indicators, at least to some extent. The greatest ad- 
vantage of the present method comes when the number of indi- 
cators is too large for the classifier to keep in mind as he reads 
the case study, when the positive and negative indicators ap- 
proximately balance one another so that intuitive judgment 
needs implementing, when a job is given to a number of classi- 
fiers who do not read every case, and when a quantitative 
statement is desired.³ 


A PRACTICAL APPLICATION 


The suggested procedure has been used with excellent results 
in Columbia University’s Office of Radio Research,⁴ in a 
psychological study of the panic following Orson Welles’ 
broadcast of The War of the Worlds. This radio broadcast was 
given on the evening of October 30, 1938, and described an 
imaginary invasion of Martians on the American continent 
which threatened the entire civilized world. Of six million 
persons who heard the broadcast, over one million were 
frightened and many were reduced to a state of panic. People 
prayed, hid in cellars, ran into the streets undressed, called 
the police for advice, warned their neighbors, and even fled 
headlong to get out of the ‘‘danger area.’’ 

3 A possible source of error in further use of the final indexes should 
not be overlooked. If the different indicators are assumed to have the 
same sampling variance (an assumption which is probably justified for 
practical purposes), the variances of the final indexes will be different if 
these indexes are based on different numbers of indicators, for the variance 
of a mean is inversely proportional to the number of cases on which it is 
based. This might sometimes be important to know in further statistical 
treatment, e.g., in the analysis of variance. 

4 At that time the Princeton Radio Research Project. 

In studying the panic it was necessary to distinguish be- 
tween subjects who would and would not be expected to be 
vulnerable in this kind of situation. It was desired also to 
classify subjects as vulnerable and invulnerable without refer- 
ence to their reaction to the broadcast. Available for classi- 
fying the subjects were the records of a detailed interview with 
each. 

The continuum was defined as ‘‘susceptibility-to-suggestion- 
when-facing-a-dangerous-situation.’’ Numerous specific indi- 
cators on this continuum were found in each interview. It 
was possible to classify them under eleven different headings, 
as follows: 

(1) Religiosity, or piety. It was felt that spontaneous indi- 
cation of strong religiousness or piety—in addition to that 
indicated by checking ‘‘Yes’’ to the question, ‘‘Do you believe 
that God can and does control events on this earth?’’—was 
indicative of a positive degree of susceptibility, because per- 
sons in this category would be more inclined than others to 
believe in the possibility of a major upheaval of the kind de- 
scribed. Persons with this characteristic were thus given a 
mark of +1 on this factor. Persons merely responding ‘‘Yes’’ 
to the above question were given a mark of zero, because they 
were characterized by a lesser probability of susceptibility. 
Persons checking ‘‘No’’ were given a mark of -1. 

(2) Church attendance. This indicator was scored in a 
similar fashion to the first. It was felt that those who at- 
tended church with great regularity were more inclined to be 
susceptible than others, mainly because of the high correlation 
between church attendance and religiousness. Persons who 
attended regularly were given a mark of +1. Persons who 
did not attend church, on the other hand, were felt to be less 
susceptible, and were given the mark of —1. Zeroes were 
given to those with an intermediate degree of attendance. 
(3) Fatalism. Persons who believe themselves governed by 
powers beyond their control should be more likely to believe 
in the existence of a reported catastrophe than others. Per- 
sons who exhibited a high degree of fatalism were thus marked 
+1, while those exhibiting no appreciable degree of fatalism 
were marked —1, and zeroes went to intermediate degrees of 
this characteristic. Ratings on this factor were derived from 
answers to the following questions: ‘‘Does man’s life on this 
earth seem to you meaningless, temporary or futile?’’ ‘‘What 
sort of catastrophe did you think it was?’’ 

(4) Racial prejudice. On the hypothesis that racial preju- 
dice sometimes results from implicit and perhaps unconscious 
fear or feelings of inferiority or disappointment, which are 
relevant to the panic situation, persons with a high degree of 
racial prejudice were given marks of +1, and marks of zero 
and —1 were distributed as in the case of the factors already 
listed. 

(5) Insecurity. The relation of this characteristic to the 
panic situation is evident without elaboration. Information 
as to this factor was derived from answers to the following 
questions: ‘‘Is the security of your job dependent on business 
conditions or the friendship of certain people?’’ ‘‘What are 
the things in your life which you would like to have 
different?’’ ‘‘What are the things you worry most about?’’ 

(6) The possession of miscellaneous phobias. The posses- 
sion of phobias indicates a personality set favorable to panic. 
Indicators for this factor were gained from answers to the 
question, ‘‘What three things are you most afraid of?’’ 

(7) Amount of worry. Information on this factor, which 
has an obvious relation to susceptibility, was gained from an- 
swers to the question, ‘‘Do you think that you worry more than 
other people?’’ 

(8) Lack of self-confidence. Persons were rated on this 
characteristic by means of answers to a question dealing with 
their readiness to argue in public. 

(9) Agreement with scientists. Those who believe the 
statements of science are not inclined to believe in things which 
science has declared to be very improbable. Hence agreement 
with scientists should be conducive to disbelief in a report of a 
Martian invasion or the ‘‘end of the world.’’ 

(10) Altruism-egoism. Persons with primary concern for 
the welfare of others, e.g., parents, should remain calmer than 
those with primary egoistic concern. 

(11) Miscellaneous. In a few instances additional criteria 
were employed. This heading includes factors such as chronic 
nervousness and emotional instability when their existence 
could be reliably inferred from the interview. 

The method of scoring is shown in the following scheme, 
indicating the computation of scores for three fictitious indi- 
viduals : 











Individual    Scores on the indicators recorded, (1) to (11)    Score

First         +1, .. (remaining entries illegible)              +.750
Second        +1, 0, +1, +1                                     +.750
Third         +1, +1, +1, +1, -1, +1, +1, +1                    +.750





The frequency distribution of susceptibility scores for 100 
subjects is given below. Positive scores indicate susceptibility 
and negative scores lack of susceptibility. 











                           Education
Score              Less than        High school
                   high school      or more

(illegible)             1                1
(illegible)             3                7
(illegible)             8               11
(illegible)             9               14
(illegible)            12               13
.12 to ..               9                8
.34 to ..               2                2
.56 to ..               -                -

Total                  44               56
These scores, which allow discrimination of degree of 
susceptibility, have stood all tests for validity to which they have 
been subjected. They have in addition been instrumental in 
verifying two hypotheses with which the study commenced. 
(1) The mean score for persons with less than high school 
education is -.113, while that for persons with high school 
education or more is -.181,⁵ indicating that amount of education 
is negatively related to susceptibility. (2) The degree of panic 
actually exhibited by each subject was objectively rated quite 
independently of the susceptibility scores, which were then 
correlated with the behavior ratings. The correlation 
coefficient for those with less than high school education was +.30, 
while for those with high school education or more it was +.26. 
Both of these coefficients are significant on the .95 confidence 
level.⁶ 

The suggested method should be of value wherever it is 
necessary to classify case studies or interviews. It is proving 
of value in two additional studies in the Office of Radio Re- 
search at present. One involves discriminating between buy- 
ers on a rationality-irrationality attitude continuum, and the 
other involves determining the degree of authority exerted by 
heads of families. 

5 This difference is statistically significant on the .95 confidence level. 

6 Further discussion of this example, employing a less complicated form 
of the same method, can be found in The Invasion From Mars (Hadley 
Cantril, with the assistance of Hazel Gaudet and Herta Herzog), 
Princeton University Press, Princeton, N. J., 1940, pp. 128 ff. 








STATISTICALLY SIGNIFICANT DIFFER- 
ENCES IN OBSERVED PER CENTS 


CUTHBERT DANIEL 
Office of Radio Research, Columbia University 


IT frequently happens that the proportion of individuals in 
one group who have some stated property is claimed to be 
greater than the proportion having this same property in 
another group. ‘‘More men than women listen to forum 
programs.’’ ‘‘Persons of higher income listen more than others 
to educational programs.’’ ‘‘Urban listeners are more 
interested in foreign affairs programs than rural listeners.’’ Such 
statements abound in research reports and tables. 

Generally these judgments are made on the basis of sample 
groups of individuals. A hundred men are interviewed and 
a certain proportion (say A per cent) are found to listen to 
news analysts and commentators. A hundred women are 
interviewed and a smaller per cent (say B per cent) report 
regular listening to news analysts. The question which this 
paper plans to answer is: 

By how much must A exceed B so that one can be 95 per 
cent sure that the difference found is not due to the smallness 
of the sample? 

All calculations can be eliminated by the use of Table 1, once 
the per cents A and B have been computed, and provided con- 
ditions 1 to 5 below have been satisfied. The table gives the 
amount by which A must exceed B for different observed 
values of B and for sizes of samples from 20 to 1000. 

It is also possible to use this table to estimate the 
significance of differences in a single sample. It is only necessary 
to enter the table in the column most nearly indicating 
one-half the actual sample size. 
TABLE 1 


By How Much Must a Per Cent (A) Observed in One Sample Differ from 
That Observed (B) in Another Sample of the Same Size to be 
Statistically Significant? (0.05 Level of Significance) 


Lower
per cent                        Size of Each Sample
(B)         20    25    30    35    40    45    50    60    70    80    90

10                                      15.8  14.7  13.3  12.2  11.2  10.5
20                26.0  23.6  21.7  20.1  18.8  17.8  16.1  14.8  13.8  13.0
30          30.9  27.4  25.0  23.1  21.5  20.2  19.2  17.4  16.0  15.0  14.1
40          30.8  27.6  25.3  23.4  21.9  20.6  19.6  17.9  16.6  15.5  14.6
50          29.6  26.7  24.5  22.8  21.4  20.2  19.2  17.6  16.3  15.3  14.5
60          27.3  24.8  22.8  21.3  20.1  19.0  18.1  16.7  15.5  14.6  13.8
70          23.8  21.7  20.2  18.9  17.8  17.0  16.2  15.0  13.9  13.1  12.4
80                17.5  16.4  15.4  14.6  13.9  13.3  12.4  11.6  10.9  10.4
90                                        ..    ..    ..    ..    ..   7.2


Lower
per cent                        Size of Each Sample
(B)        100   120   140   160   180   200   250   300   400   500  1000

10            ..    ..    ..    ..    ..    ..    ..    ..    ..   4.0    ..
20          12.2  11.0  10.2   9.5   8.9   8.4   7.5   6.8   5.8   5.2    ..
30          13.4  12.2  11.2  10.5   9.9   9.3   8.3   7.6   6.5   5.8   4.1
40          13.8  12.6  11.7  10.9  10.3   9.8   8.7   8.0   6.9   6.1   4.3
50          13.7  12.5  11.7  10.9  10.3   9.8   8.7   8.0   7.0   6.2   4.4
60          13.1  12.0  11.2  10.5   9.9   9.4   8.4   7.7   6.7   6.0   4.3
70          11.9  10.9  10.2   9.5   9.0   8.6   7.7   7.1   6.2   5.5   4.0
80           9.9    ..   8.6   8.0   7.6   7.3   6.5   6.0   5.3   4.7   3.4
90            ..   6.4    ..    ..    ..    ..   4.7   4.3   3.8    ..   2.5





CONDITIONS, REQUIREMENTS, RESTRICTIONS 


Each of these should be checked each time the table 
is used. 

1. Two independent samples of roughly the same size 
(within 40 per cent in numerical size) are available. 

2. A per cent A, having some property, has been found in 
one sample to be greater than a per cent B, found to have the 
same property in the other sample. 

3. Both samples must be random, that is, every individual 
in the eligible population must stand an equal chance of being 
in the sample. 

4. Interpolations, that is, use of values between entries in 
the table, are accurate throughout. Use of the table below 10 
per cent, above 90 per cent (for B), below twenty individuals 
or above 1000, is not permissible. 

5. For cases where A just exceeds B by the indicated amount, 
the use of this table will give an erroneous answer about one 
time in twenty. This is because, in samples of the sizes 
indicated, even truly random samples will represent the actual 
population erroneously, one time in twenty (or five per cent 
of the time). For this reason one says that the table gives 
‘‘95 per cent certainty.’’ 


TO USE THIS TABLE 


1. Select from the left-hand column the lower per cent 
found (B). 

2. Move to the right until you are in the column most closely 
representing the size of the two samples used, or the average 
size if the two are within 40 per cent of each other. 

3. See if the number in that cell represents a larger differ- 
ence than the difference of your lower and upper per cents. 

4. If it does, the two per cents do not differ significantly, 
since the difference may be due to sampling errors. If it does 
not, we can say that the difference is statistically significant 
since it would occur less than 5 per cent of the time by chance. 
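
The four steps above amount to a table lookup followed by one comparison, which can be sketched as follows. The dictionary holds only a few cells copied from Table 1 (rows 20, 30, and 70 per cent; columns 40, 60, 80, and 90); the names `TABLE_1_EXCERPT` and `differs_significantly`, and the choice of the nearest row and column rather than interpolation, are assumptions made for this illustration.

```python
# Excerpt of Table 1 (0.05 level): minimum significant difference in
# percentage points, keyed by lower per cent B, then by size of each sample.
TABLE_1_EXCERPT = {
    20: {40: 20.1, 60: 16.1, 80: 13.8, 90: 13.0},
    30: {40: 21.5, 60: 17.4, 80: 15.0, 90: 14.1},
    70: {40: 17.8, 60: 15.0, 80: 13.1, 90: 12.4},
}


def differs_significantly(lower_pct, upper_pct, sample_size, table=TABLE_1_EXCERPT):
    """Follow steps 1 to 4: nearest row for B, nearest column for N, then compare."""
    row_key = min(table, key=lambda b: abs(b - lower_pct))       # step 1
    row = table[row_key]
    col_key = min(row, key=lambda n: abs(n - sample_size))        # step 2
    required = row[col_key]                                       # step 3
    observed = upper_pct - lower_pct
    # Step 4: significant only if the tabled value is not larger than the
    # observed difference.
    return observed >= required, required


significant, required = differs_significantly(70, 81, 80)
print(significant, required)  # False 13.1
```

The call shown uses 70 and 81 per cent in samples of about 80: the tabled value of 13.1 exceeds the observed difference of 11 points, so the difference is not significant.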


THREE ALTERNATIVE AND EQUIVALENT STATEMENTS OF THE 
MEANING OF THIS TABLE 


a. Only 1 time in 20 will two per cents calculated from two 
samples of this size differ by chance by an amount as large as 
that shown in the table. 

b. If two per cents differ by less than the amount indicated 
in the body of the table, then no conclusions can be drawn as to 
the meaning of this difference since it may be due to errors of 
sampling. 
c. The numbers in the table may be called the 
doubled-standard-error of the percentages in the left column. This error 
will only be exceeded due to sampling variations, 5 per cent of 
the time. 


EXAMPLES 


A. Seventy per cent (70%) of the 80 respondents to a ques- 
tionnaire who called themselves ‘‘Liberals’’ were pro-Ickes in 
a debate between Ickes and Gannett. Eighty-one per cent 
(81%) of the 87 persons interviewed in another study were 
‘‘Liberal’’ and ‘‘pro-Ickes.’’ Does this mean that there was 
a larger percentage of ‘‘Liberal’’ pro-Ickes persons in the 
population from which the interviewees were chosen than 
among the questionnaire respondents? 

Looking at the table in row 70 per cent and in column 80 
(sample size), we find the number 13. The observed difference 
is only 11 points, so 70 per cent does not differ from 81 per 
cent ‘‘significantly’’; i.e., such a difference would turn up by 
sampling errors more than one time in twenty. To see whether 
using 87, the actual sample size, rather than 80, the column 
used, changes the result, look in the next column, N = 90. The 
minimum significant difference is here 12, and hence we are 
assured that 70 per cent and 81 per cent do not differ 
significantly in two samples of size 87. 

B. Interpolation for values between those in the table. Sup- 
pose the proportions of males in the audiences to two broadcasts, 
who like each broadcast, are as follows: 











                        Per cent     Size of
                        males        sample

Liked broadcast A         25.2          40
Liked broadcast B         51.3          60








Looking at the relevant per cents and sample sizes, we take the 
following significant differences from Table 1 and repeat them 
in Table 1a. Thus, for 25 per cent, the interpolated row would 
read 21 and 17. 


TABLE 1a 
Part of Table 1, here reproduced for a specific example 

                  N
Per cent      40      60

   20         20      16
   30         22      18


Our observed difference in per cents is 26, and therefore is 
statistically significant. It should be clear that if this 
observed difference had been 21 per cent points, we could still be 
sure the difference is statistically significant. For, if both 
samples had been of size 40, then 21 points would be significant, but 
one of our samples is larger than this and hence the required 
difference is a little smaller than 21 (actually 20.4).¹ 

1 Those interested in theoretical statistics will recognize the equivalence 
of this table to a ‘‘Chi-square calculation’’ for a 2 x 2 contingency table 
under the restriction of roughly equal totals for the two rows. The table 
is merely the tabulation of the results of the Chi-square calculation carried 
out repeatedly for critical values of the variables concerned. 
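
The Chi-square equivalence can be made concrete with a short sketch. It assumes two samples of equal size n, no continuity correction, and the 5 per cent critical value for one degree of freedom (the square of 1.959964); these are assumptions of the sketch, not statements from the text, and the function name is illustrative. Setting Chi-square equal to its critical value gives d² = 2z²p(1-p)/n with pooled proportion p = B + d/2, a quadratic in the required difference d.

```python
import math

Z_95 = 1.959964  # two-sided 5 per cent point of the normal curve


def min_significant_difference(lower_pct, n, z=Z_95):
    """Smallest difference (percentage points) by which A must exceed B
    for two equal samples of size n, under the assumptions stated above."""
    b = lower_pct / 100.0
    k = 2.0 * z * z / n
    # Solve d^2 = k * (b + d/2) * (1 - b - d/2) for the positive root d.
    qa = 1.0 + k / 4.0
    qb = -k * (1.0 - 2.0 * b) / 2.0
    qc = -k * b * (1.0 - b)
    d = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return 100.0 * d


print(round(min_significant_difference(70, 80), 1))   # 13.1
print(round(min_significant_difference(50, 100), 1))  # 13.7
print(round(min_significant_difference(20, 40), 1))   # 20.1
```

The three values printed agree with the corresponding entries of Table 1, which suggests, without proving, that the table was built in this way.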























PRELIMINARY REPORT ON FACTORS IN 
RADIO LISTENING 


W. S. ROBINSON 
Columbia University 


AT the present time radio program preferences tend to 
be analyzed rather superficially. Investigators generally 
content themselves with isolating background factors 
such as age, sex, income, and education which seem to be 
related to differential program preferences. In this analysis 
the individual radio program itself is of chief interest, and 
little attention is given to the problem of classifying programs 
in terms of more fundamental criteria or ‘‘factors.’’ 

In order to study the possibilities of a more fundamental 
analysis, the Office of Radio Research has planned a 
large-scale factor analysis of program preferences. Preparatory to 
this detailed investigation, however, some data already in the 
files were analyzed to determine the feasibility of the more 
detailed study. The results of this preliminary analysis are 
presented here. 

The present analysis deals with the preferences of 146 
Princeton University students for 18 types of radio program. Each 
student indicated his preference for each of the program-types 
by one of the following responses: Like, No Preference, 
Dislike. The intercorrelations between program-types were then 
computed using tetrachoric coefficients, the ‘‘no preference’’ 
replies being pooled with the ‘‘like’’ or ‘‘dislike’’ replies so as 
to divide the distributions as near to the median as possible. 
The list of program-types is given below: 


(1) Comedy and variety 
(2) Serious music 

(3) ‘‘Sweet’’ dance music 
(4) ‘‘Hot’’ dance music 
(5) Folk and band music 
(6) Dramatized news 
(7) Dramatic plays 
(8) Serial stories 
(9) News bulletins 
(10) Amateur hours 
(11) Quiz programs 
(12) News commentators 
(13) Forums, talks, discussions 
(14) Dramatizations of historic or scientific facts 
(15) Sports news 
(16) Sports events 
(17) Programs on personal problems 
(18) Religious programs 


The intercorrelations are shown in Table 1. 





METHOD 


The intercorrelations of Table 1 were factored by the center 
of gravity method.² Tucker’s empirical criterion,³ 

$$\phi = \frac{\sum \rho_{s+1}}{\sum \rho_{s}} = \frac{n-1}{n+1},$$

was employed to determine the number of factors to be drawn 
from the correlation matrix. For 18 variables the value of 
$\frac{n-1}{n+1}$ is .895. After three factors had been 
extracted the value of $\phi$ was .885; this was considered 
sufficiently close to the limiting value for the exploratory analysis, 
and a fourth factor was therefore not removed. 

The arbitrary orthogonal reference frame given by the cen- 
troid method was not interpretable. It was therefore rotated 
graphically to a unique position with respect to the coordinate 
axes using Thurstone’s method.⁴ In the rotation, however, it 
was not possible to do away with all significantly negative 
loadings. In fact in a problem of this kind there seems no 
need to do so. The rotational criterion adopted, consequently, 
was merely the maximizing of zero and near-zero loadings. 


2 L. L. Thurstone, Vectors of Mind, Chicago, 1935, pp. 232-250. 
3 L. L. Thurstone, Primary Mental Abilities, Chicago, 1938, pp. 46-67. 
4 Primary Mental Abilities, pp. 71-77. 
This criterion resulted in a psychologically meaningful rotated 
factorial matrix, shown in Table 2.⁵ 


Interpretation of the Factors 


In order that a factor may be given an unequivocal interpre- 
tation, it is necessary that the loadings be derived from tests 
of sufficient heterogeneity to allow identification of the com- 
mon element. It is to be expected, therefore, that identifica- 
tion of the factors of Table 2 will be extremely tentative, be- 
cause types of radio programs do not differ as clearly as indi- 
vidual programs. Had individual programs been used, as is 
planned in future analyses, the task of identification would 
have been considerably simplified. With the material in hand, 
however, it is possible to gain some insight into the nature of 
the factors. 


5 The matrix derived by the centroid method gives the factorial com- 
position of the table of intercorrelations, but with reference to a purely 
arbitrary set of coordinate axes. Centroid factorial matrices for this 
reason rarely make psychological sense, and it is thus necessary to choose 
another set of coordinate axes which will make sense. Various criteria are 
used in choosing meaningful axes. In dealing with ability tests, Thur- 
stone has used two criteria: (1) the maximizing of zero or near-zero 
entries in the factorial matrix, on the thesis that few tests will call for 
all of the abilities in significant amounts; and (2) the elimination of sig- 
nificantly negative entries, which are uninterpretable because it is difficult 
to think of an ability which is a detriment to the performance of a test. 
In practice, negative entries from .000 to —.200 are traditionally permit- 
ted, since a loading of .200 or less on a factor means that that factor con- 
tributes only four per cent or less of the variance of the test in question. 
In dealing with ability tests, Thurstone found that when (1) was satisfied 
(2) also was satisfied; that is, when the number of zero or near-zero loadings 
is maximized, significantly negative entries disappear. There is no reason, 
however, why radio programs should always show positive saturations in 
the reference traits of the factorial matrix. It is easy to think of traits 
which would cause one not to listen. For this reason, Thurstone’s cri- 
terion (2) was not employed; #t was in fact impossible of attainment. 
Criterion (1) on the contrary, has distinct relevance to the problem at 
hand, for it seems logical to believe that few radio programs will involve 
all the factors to a significant degree. 

TABLE 2 
Rotated Factorial Matrix 

Program-
type           I        II       III     Communality⁶
  1          .575     .087     .225        .389
  2         -.248     .114    -.460        .286
  3         -.004     .418     .313        .273
  4          .085    -.382     .305        .246
  5          .005     .463     .010        .215
  6          .473     .556     .082        .540
  7          .448     .250     .056        .266
  8          .225     .456     .439        .451
  9          .134     .338    -.293        .218
 10          .053     .650     .385        .574
 11          .446     .206    -.086        .249
 12          .198     .580    -.416        .549
 13         -.169     .759    -.310        .701
 14          .475     .503    -.006        .479
 15          .633     .100     .072        .416
 16          .844     .005     .097        .722
 17          .002     .436     .428        .373
 18          .006     .627    -.125        .409





6 The communality is the proportion of the variance of a program-type 
attributable to the three factors together. The square of any factor load- 
ing is the proportion of the program-type variance attributable to the 
factor indicated. 
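The relation stated in footnote 6 can be checked directly against Table 2. The following is a minimal sketch, assuming the loadings are read from the table as printed; the names `rows` and `communality` are illustrative only and do not come from the original analysis.

```python
# Loadings on Factors I, II, III and the reported communality
# for three program-types of Table 2 (rows 1, 13 and 16).
rows = {
    1: ((0.575, 0.087, 0.225), 0.389),
    13: ((-0.169, 0.759, -0.310), 0.701),
    16: ((0.844, 0.005, 0.097), 0.722),
}


def communality(loadings):
    """Sum of squared loadings: the share of variance due to the factors together."""
    return sum(x * x for x in loadings)


for program_type, (loadings, reported) in rows.items():
    shares = [round(x * x, 3) for x in loadings]  # variance share of each factor
    print(program_type, shares, round(communality(loadings), 3), reported)

# A loading of .200 accounts for .200 squared, i.e. four per cent, of the variance.
small_loading_share = 0.200 ** 2
```

Within rounding, the computed sums agree with the communalities printed in the table, which is also a convenient check on the legibility of individual loadings.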


Factor I. The program-types with significant loadings 
(greater than .400) on Factor I are the following: 


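(16) [name illegible in source] .......................... .844
(15) [name illegible in source] .......................... .633
( 1) Comedy and variety .................................. .575
(14) Dramatizations of historic or scientific facts ...... .475
( 6) Dramatized news ..................................... .473
( 7) Dramatic plays ...................................... .448
(11) [name illegible in source] .......................... .446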


Each program-type with a significant loading on this factor 
either emphasizes or directly involves drama of some kind. 
The connection is clear enough so far as sports, scientific and 
historic dramatizations, dramatized news, and dramatic plays 
are concerned. Quiz programs also, however, involve compe- 
tition to an important degree. Moreover, the element of drama 
appears in many of the well-known comedy and variety pro- 
grams, and competition is stressed by several of them (e.g., 
Jack Benny and Fred Allen). Serial stories, moreover, have 
a loading of .225 on this factor, and this suggests the same 
common element. The only other program-types of the 18 in 
which drama customarily appears, though not so strongly, are 
news bulletins (.134) and news commentators (.198), and 
these loadings, while not significant, are at least suggestive. 
Factor I, then, is tentatively identified as a drama factor. 

Factor II. The program-types with significant loadings on 
Factor II are as follows: 























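(13) Forums, talks, discussions .......................... .759
(10) Amateur hours ....................................... .650
(18) Religious programs .................................. .627
(12) News commentators ................................... .580
( 6) Dramatized news ..................................... .556
(14) Dramatizations of historic or scientific facts ...... .503
( 5) Folk and band music ................................. .463
( 8) Serial stories ...................................... .456
(17) Programs on personal problems ....................... .436
( 3) Sweet dance music ................................... .418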





This list is very confusing, probably because of the very gen- 
eral nature of the program-types. A given type designation 
may involve so many different facets that it is difficult to see 
what all these types could possibly have in common. 

Without detailed discussion, it is suggested as a rather wild 
speculation that Factor II might on more detailed study turn 
out to be an ‘‘inspirational’’ factor, a factor involving awak- 
ening, quickening, a sense of the meaning of things. To some 
extent the catalogued program-types bear out this interpreta- 
tion. Forums, talks, and discussions could contribute to this 
kind of need, and religious programs also. Programs on per- 
sonal problems find a place in the list quite readily. It is con- 
ceivable that news commentators, dramatized news, and drama- 
tizations of historic and scientific facts could contribute to the 
same kind of desire. The inclusion of amateur hours, however, 
at least on first glance, is unfortunate. Yet the best known 
amateur hour (Major Bowes) involves a distinctly inspira- 
tional point of view. Folk or band music and ‘‘sweet’’ dance 
music, however, have an anomalous position in the list. It is 
conceivable that they have an inspirational function, but the 
case is not very strong. There are but two program-types of 
possible inspirational character that do not appear in the list, 
and these are serious music (.144) and dramatic plays (.250) ; 
both with positive but non-significant loadings. The single 
program-type with a negative loading is ‘‘hot’’ dance music 
(— .382), and this nearly significant loading perhaps tends to 
bear out the interpretation suggested. Naturally, however, 
the analysis above is rank speculation. Only a more detailed 
analysis with individual programs rather than program-types 
would serve to indicate the nature of Factor II in anything 
approaching a precise fashion. 

Factor III. Program-types with significant loadings on 
Factor III are as follows: 


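( 8) Serial stories ...................................... .439
(17) Programs on personal problems ....................... .428
( 2) [name illegible in source] .......................... -.460⁷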


The number of program-types in this list is unfortunately not 
large enough even for speculative purposes. 


CONCLUSION 

Three points seem worth mentioning: 

(1) In the analysis of program-preference intercorrela- 
tions, Thurstone’s criterion of maximizing the number of zero 
or near-zero entries in the centroid factorial matrix appears 
to result in a psychologically meaningful rotated matrix. 

(2) In the present analysis two factors involved in pro- 
gram-preference intercorrelations have been tentatively iden- 
tified: (I) a drama factor, with some degree of assurance, and 
(II) an ‘‘inspirational’’ factor, with considerably less assur- 
ance. 

(3) The need for using correlations between preferences for 
individual radio programs rather than correlations between 
preferences for general types of programs is clearly indicated. 
7 See footnote 5 for the interpretation of significantly negative loadings. 

THE PROBLEM 


COMMERCIAL and educational research agencies devote 
much energy to studying the program preferences of 
different groups of radio listeners. In this research, 
interest centers mainly in isolating factors correlated with 
differential preferences. Typical of such factors are age, 
income, education, and sex. 

One important factor underlying differences in program 
preferences, however, has received little or no attention. This 
factor is the amount of radio listening engaged in. It will be 
shown in this paper, for instance, that persons who listen to a 
large number of radio programs exhibit program preferences 
very different from those of persons who listen little. 

The main purpose of this paper is to develop a method for 
determining the influence of amount of listening upon pro- 
gram selection. The method is designed to answer questions 
such as the following: Do persons who listen to symphony 
programs (e.g., the Ford Hour) hear more or fewer other pro- 
grams than average? Do the persons who select a very popu- 
lar program (e.g., Jack Benny) listen to a normal amount of 
other programs, or do they listen to a greater-than-normal or a 
less-than-normal number? Are news programs more often 
selected by individuals who hear little else? In general, do 
programs appear in individual ‘‘listening budgets’’ in the 
amount one would expect from their general popularity, or 
are some programs heard more frequently by individuals who 
hear a great many or very few programs? 

The problem presented by these questions is not simple. 
It is not possible to make direct comparisons of the frequency 
with which a given program is heard by groups which differ 
in amount of listening, because the probability of choosing a 
program increases with the total amount of listening. A 
person listening to five programs, for instance, is more likely 
to hear any given program than a person who listens to one 
program only. 

THE METHOD 


Assume that all the programs heard by a number of people 
during a given interval of time have been reported. A list of 
programs which are of interest to the investigator is then 
compiled from the more general list of all programs on the 
air during this interval. The selected programs must of 
course afford freedom of choice between them. Every person 
in the group who heard one or more of the programs on the 
investigator’s list is now selected as a member of a group of 
subjects whose preferences will be analyzed. The investigator 
thus has a number of listening reports, each report indicating 
which programs from the selected list a particular individual 
has heard completely. 

The present analysis thus begins with the listening reports 
of a number of people. In each report it is indicated whether 
the corresponding individual did or did not completely hear 
each program on a list. Each one of these subjects has heard 
at least one of the listed programs. Most of the subjects have 
heard more than one of these programs, and often a few per- 
sons have heard all of them. 

These listening reports permit the investigator to distinguish 
between listeners with respect to amount of listening. The 
total number of programs which were reported heard by a 
given individual is an index of that individual’s amount of 
listening so far as the listed programs are concerned.* 

1 There is a little unavoidable overlap in time in the materials used in 
this paper, but for the purposes of demonstrating a method this was 
ignored. 

TABLE 1 

Distribution of Audits to Ten Chosen Evening Programs by the Number 
of These Ten Programs Heard Completely 
(? = entry illegible in the source) 

No. of      No. of    No. of                       Program
programs    listen-   audits
heard       ers                  A     B     C     D     E     F     G     H     I     J
   1          65        65      12     4     7     ?     ?     ?     ?     ?     ?     ?
   2          63       126      37    33     ?     ?     ?     ?     ?     ?     ?     ?
   3          73       219      58     ?     ?     ?     ?     ?     ?     ?     ?     ?
   4          51       204      47    45    30    18    10    19     9    11     4    11
   5          30       150      30     ?     ?     ?     ?     ?     ?     ?     ?     ?
 6 to 10       0         0

 Total       282       764     184   167   119    56    46    43    40    40    37    32





The information given in the listening reports may be sum- 
marized in a form similar to Table 1, which shows the number 
of audits given to each of ten programs by the number of pro- 
grams heard completely. An ‘‘audit’’ is the hearing of one 
program by one person. If one person hears five programs, 
the result is five audits. If five people in the group hear the 
same program, the result is also five audits. 
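The tabulation just described can be sketched in a few lines. This is only an illustration under stated assumptions: the five listening reports below are invented for the example (they are not the study's data), and the names `reports`, `listeners` and `audits` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical listening reports: each set holds the listed programs
# that one person heard completely.
reports = [{"A"}, {"A", "B"}, {"B", "C"}, {"A", "B", "C"}, {"C"}]

listeners = Counter()          # persons in each amount-of-listening group
audits = defaultdict(Counter)  # audits[k][program]: audits from persons hearing k programs

for heard in reports:
    k = len(heard)             # amount of listening for this person
    listeners[k] += 1
    for program in heard:
        audits[k][program] += 1  # one program heard by one person is one audit

for k in sorted(listeners):
    row_total = sum(audits[k].values())  # equals k times the number of listeners
    print(k, listeners[k], row_total, dict(sorted(audits[k].items())))
```

Each printed line corresponds to one row of a table laid out like Table 1: the number of programs heard, the number of listeners, the number of audits, and the audits given to each program.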

Table 1 is easily interpreted. The first line indicates that 
65 persons in the group of 282 heard only one program. Of 
these 65 persons, 12 heard Program A, 4 heard Program B, 
7 heard Program C, and so on. The second line indicates that 
63 persons heard just two programs. Since each person heard 
two programs, the total number of audits is 63 × 2 = 126. Of 
the 126 audits, 37 were given to Program A, 33 to Program B, 
and so on. It will be noted that none of the 282 persons in 
the group heard more than five of these ten programs. 

The problem to be solved can now be restated in terms of 
Table 1. Consider Program A. Out of 184 audits to this 
program, 12 (6.5%) came from listeners who heard only one 
out of the ten programs with which the table deals, 37 (20.1%) 
came from listeners who heard two programs, 58 (31.5%) came 
from listeners who heard three programs, and so on. The 
question is, are the numbers of audits given to Program A by 
listeners who hear different numbers of programs what would 
be expected if specific programs audited were unrelated to 
number of programs heard? 

Now this question can be answered, not only for Program A 
but for all the programs of Table 1 by a method which is very 
simple in principle. This method consists in (1) assuming 
that a person’s preference for a given program is completely 
unrelated to the number of programs that he hears, (2) com- 
puting the number of audits which persons hearing 1, 2, 
3, . . . N programs would give to the program if this assump- 
tion were true, and (3) comparing the actual audits given this 
program with the audits computed on the foregoing assump- 
tion. The method of doing this will now be given in detail. 

It is necessary first to compute the proportion of the total 
audits which were given to each of the ten programs. This 
is simply the relative frequency with which the entire group 
of subjects, irrespective of number of programs heard, listened 
to the ten programs. These figures are given in Table 2. 


TABLE 2 
The Ten Selected Programs by Percentage of Audit 

              Number of     Proportion
Program       audits        of audits
A               184           .2408
B               167           .2186
C               119           .1557
D                56           .0733
E                46           .0602
F                43           .0563
G                40           .0524
H                40           .0524
I                37           .0484
J                32           .0419

Total           764          1.0000

The second step consists in computing the number of audits 
which would have been given to Programs A to J, for each 
amount-of-listening group, on the assumption that amount of 
listening is unrelated to choice of program. This must be 
done separately for each amount-of-listening group. 

a. Persons who heard only one program. The relative fre- 
quencies with which the programs were heard by the group as 
a whole are given in Table 2. The 65 audits to Programs 
A to J for persons hearing only one program, therefore, should 
be distributed among these programs in the proportions indi- 
cated in Table 2. In other words, if number of programs 
heard is unrelated to program choice, the distribution of audits 
between programs A to J for the 65 persons hearing only one 
program will be the same as for the group as a whole. 

For Program A, there should thus be (65) (.2408) audits, 
or 15.7. There should similarly be (65) (.2186) =14.2 audits 
for Program B, and (65) (.1557) =10.1 for Program C. The 
theoretical number of audits for the remaining programs are 
computed in similar fashion. 

In this way the first row of a table similar to Table 1, but 
showing the number of audits which would be expected if there 
were no relation between amount of listening and program 
preference, can be built. To this point, the results are given 
in Table 3. Note that in Table 3 the total number of theoreti- 
cal audits, 65, is equal to the total number of actual audits for 
the first row of Table 1. 

TABLE 3 

Theoretical Distribution of Audits to Ten Selected Programs for 
Listeners Who Heard But One Program 

Program:     A      B      C      D     E     F     G     H     I     J
Audits:    15.7   14.2   10.1    4.8   3.9   3.7   3.4   3.4   3.1   2.7  = 65.0
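The first step and the one-program case can be reproduced with a short sketch. The audit counts are those of Table 2; the variable names are illustrative assumptions, not part of the original method.

```python
# Audits given to each program by the whole group (Table 2).
audits = {"A": 184, "B": 167, "C": 119, "D": 56, "E": 46,
          "F": 43, "G": 40, "H": 40, "I": 37, "J": 32}

total = sum(audits.values())
proportion = {p: n / total for p, n in audits.items()}  # relative popularity

# Persons who heard only one program: their audits are spread over the
# programs in the same proportions as those of the whole group.
one_program_listeners = 65
expected = {p: one_program_listeners * q for p, q in proportion.items()}

for p in audits:
    print(p, round(proportion[p], 4), round(expected[p], 1))
```

Rounded to one decimal, the expected audits correspond to the entries of Table 3, and they necessarily add up to the 65 actual audits of the first row of Table 1.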





b. Persons who heard two programs. Table 1 shows that 
63 persons heard just two programs. An assumption about 
the nature of the listening situation is now introduced, namely 
that a listener exhibits differential preferences for the pro- 
grams he hears. This assumption has sound basis in fact. It 
will therefore be assumed that if a person hears two programs, 
one of these represents his first choice among the programs 
considered, and the other represents his second choice. If it 
came to hearing only one of the two programs, he would not 
hear his program of second choice. Secondly, it is assumed 
that the number of listeners in any amount-of-listening group 
who choose a given program is proportional to the frequency 
with which that program is heard by the entire group relative 
to other programs available for choice (Table 2). This is an 
assumption vital to the validity of the present method since 
the theoretical audits are based upon it. 

On these assumptions, then, among the 63 persons who 
heard two programs, (63) (.2408) =15.2 persons will choose 
Program A as first choice.² None of these 15.2 persons, how- 
ever, can choose Program A as second choice, yet each of them 
did listen to a second program. On the assumption that pro- 
gram preference is not related to amount of listening, these per- 
sons would choose among the remaining programs with the 
same relative frequency as the entire group, as indicated in 
Table 2. Of the 15.2 theoretical persons selecting A as first 
choice, therefore, 

$$\frac{(15.2)(.2186)}{1 - .2408} = \frac{(15.2)(.2186)}{.7592} = 4.4$$

persons³ would select Program B as second choice. Similarly, of 
the 15.2 persons, 

$$\frac{(15.2)(.1557)}{1 - .2408} = 3.1$$

persons would choose Program C as second choice. The remaining 
second-choice audits for Programs D to J are computed similarly, so that 
the second-choice audits for Programs B to J for the 15.2 
persons choosing A first are as follows: 


TABLE 4 

Distribution of Second-Choice Audits for the 15.2 Theoretical Persons 
Choosing Program A First (Persons Hearing Two Programs) 

Program:     B     C     D     E     F     G     H     I     J
Audits:     4.4   3.1   1.5   1.2   1.1   1.0   1.0   1.0    .9  = 15.2


2 No assumption is made that Program A was chosen first in time. The 
results of this analysis would be the same if it were assumed that any 
other of the ten programs was chosen first in order of time. It will be 
observed later that temporal order of choice, while introduced as a method- 
ological device, is at the end ruled out of consideration because all possible 
temporal orders are considered. The order assumed is order in preference, 
not order in time. 

3 The divisor (1 - .2408) = .7592 is introduced to change the distribu- 
tion of audits to Programs B to J in Table 2 to a 100 per cent basis. The 
15.2 theoretical persons choosing Program A are divided into smaller 
groups according to their second choice of other programs in proportion 
to the percentages of Table 2, and the divisor is introduced so that the 
number of persons choosing programs B to J on second choice after A 
will be 15.2. 






On the same assumptions, among the 63 persons who heard 
two programs, (63) (.2186) = 13.8 persons will choose Pro- 
gram B as first choice. Each of these 13.8 theoretical persons 
also listens to another program as second choice, and these 
second-choice audits are distributed among Programs A, C, 
D, ... J in proportion to the listening frequencies of the 
entire group as shown in Table 2. Of the 13.8 theoretical 
persons, therefore, 

$$\frac{(13.8)(.2408)}{1 - .2186} = 4.2$$

should listen to Program A as second choice. Similarly, 

$$\frac{(13.8)(.1557)}{.7814} = 2.7$$

should hear Program C as second choice, and 

$$\frac{(13.8)(.0733)}{.7814} = 1.3$$

should hear Program D as second choice. The theoretical 
numbers of persons hearing the remaining programs as sec- 
ond choice are computed in similar fashion. The second- 
choice audits for Programs A, C, D, ... J for the 13.8 
theoretical persons choosing B first are given in Table 5. 


TABLE 5 

Distribution of Second-Choice Audits for the 13.8 Theoretical Persons 
Choosing Program B First (Persons Hearing Two Programs) 

Program:     A     C     D     E     F     G     H     I     J
Audits:     4.3   2.7   1.3   1.1   1.0    .9    .9    .9    .7  = 13.8


On the same assumptions, among the 63 persons who heard 
two programs, (63) (.1557) = 9.8 persons will choose Pro- 
gram C as first choice. The second choices of these 9.8 per- 
sons can then be distributed among Programs A, B, D, 
E, ... J as has been done previously for those choosing 
Program A and Program B as first choice. In a similar 
fashion the (63) (.0733)=4.6 theoretical first choices for 
Program D can be distributed among Programs A, B, C, E, 
F, ...J as second choices and the same procedure can be 
repeated until the first choices for each program have been 
distributed among the remaining programs. The results to 
this point may be summarized in Table 6. 


TABLE 6 

Theoretical Distribution of First- and Second-Choice Audits for the 63 
Subjects Who Heard Two Programs 
(? = entry illegible in the source) 

First-     Number         Second-choice audits for remaining programs
choice     of first
program    choices       A      B      C      D      E      F      G      H      I      J
A           15.2         --     ?      ?      ?      ?      ?      ?      ?      ?      ?
B           13.8        4.3     --     ?      ?      ?      ?      ?      ?      ?      ?
C            9.8         ?      ?      --     ?      ?      ?      ?      ?      ?      ?
D            4.6         ?      ?      ?      --     ?      ?      ?      ?      ?      ?
E            3.8        1.0     ?      ?      ?      --     ?      ?      ?      ?      ?
F            3.5         .9     ?      ?      ?      ?      --     ?      ?      ?      ?
G            3.3         .8     ?      ?      ?      ?      ?      --     ?      ?      ?
H            3.3         .8     ?      ?      ?      ?      ?      ?      --     ?      ?
I            3.1         .8     ?      ?      ?      ?      ?      ?      ?      --     ?
J            2.6         .7     ?      ?      ?      ?      ?      ?      ?      ?      --

Total       63.0       13.3   12.4    9.9    5.3    4.3    4.1    3.7    3.6    3.5    2.9





According to the assumptions made, the sums in the last 
row of Table 6 indicate the number of second-choice audits 
which should have been given to Programs A to J by those 
persons hearing only two programs. The numbers of first- 
choice audits are given in the second column of the table. 
In order to find the total number of first- and second-choice 
audits which should have been given to Programs A to J on 
the assumption that amount of listening is unrelated to pro- 
gram selection, the number of first- and second-choice audits 
for each program is added. 

If program selection is unrelated to amount of listening, 
then, among the 126 audits for persons hearing two programs 
only, 15.2 + 13.3 = 28.5 should have been given to Program A; 
13.8 + 12.4= 26.2 to Program B; 9.8+9.9=19.7 to Program 
C, and so on. 
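The two-program computation can be condensed into a short sketch. It assumes the proportions of Table 2 and carries the arithmetic without intermediate rounding, so individual figures may differ by a tenth or so from those obtained by adding rounded entries; the names used are illustrative.

```python
# Relative popularity of each program (Table 2).
proportion = {"A": .2408, "B": .2186, "C": .1557, "D": .0733, "E": .0602,
              "F": .0563, "G": .0524, "H": .0524, "I": .0484, "J": .0419}
listeners = 63  # persons who heard just two programs

# First choices: distributed like the audits of the whole group.
first = {p: listeners * q for p, q in proportion.items()}

# Second choices: those choosing `a` first spread their second choice over
# the other programs, with popularities rescaled by the divisor (1 - p_a).
second = {p: 0.0 for p in proportion}
for a, n_first in first.items():
    divisor = 1.0 - proportion[a]
    for b, q in proportion.items():
        if b != a:
            second[b] += n_first * q / divisor

expected = {p: first[p] + second[p] for p in proportion}
for p in proportion:
    print(p, round(first[p], 1), round(second[p], 1), round(expected[p], 1))
```

The expected audits sum to 126, the number of actual audits for this group, which follows from the fact that every theoretical person contributes exactly one first and one second choice.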

At this point, therefore, it is possible to fill in two rows of 
a theoretical table corresponding to Table 1 as follows: 


TABLE 7 

Theoretical Distribution of Audits to Programs A to J by Those Who 
Heard One and Two Programs of the Ten Selected 

No. of      No. of    No. of                          Programs
programs    listen-   audits
heard       ers                  A      B      C      D     E     F     G     H     I     J
   1          65        65     15.7   14.2   10.1    4.8   3.9   3.7   3.4   3.4   3.1   2.7
   2          63       126     28.4   26.3   19.6    9.9   8.1   7.6   7.0   7.0   6.6   5.5








The purpose of the method is to make up a table similar 
to Table 1, but containing theoretical numbers of audits com- 
puted on the assumption that amount of listening is not re- 
lated to choice of program. The first two lines of such a 
table are given in Table 7. It now remains to indicate how 
the remaining three lines are filled in. 

c. Persons who heard three programs. Table 1 shows that 
73 persons heard just three programs. On the assumptions 
previously made, among these 73 persons, (73) (.2408) = 17.6 
will choose Program A as first choice. Of these 17.6 persons, 
however, each will choose another program as second choice, 
and these second choice audits will be distributed among the 
remaining programs B to J in proportion to the frequencies 
of listening to Programs B to J as given in Table 2. So, in 
a manner entirely analogous to that discussed for persons 
who heard two programs, a table similar to Table 6 can be 
built. This table is shown below as Table 8. 


TABLE 8 

Theoretical Distribution of First and Second Choice Audits for the 73 
Persons Who Heard Three Programs 
(? = entry illegible in the source) 

First-     No. of         Second-choice audits for remaining programs
choice     first
program    choices       A      B      C      D      E      F      G      H      I      J
A           17.6         --    5.1    3.6    1.7    1.4    1.3     ?      ?     1.1     ?
B           16.0        5.0     --     ?      ?      ?      ?      ?      ?      ?      ?
C           11.4        3.3    2.9     --     ?      ?      ?      ?      ?      ?      ?
D            5.3        1.4    1.3     ?      --     ?      ?      ?      ?      ?      ?
E            4.4        1.2    1.1     ?      ?      --     ?      ?      ?      ?      ?
F            4.1        1.1    1.0     ?      ?      ?      --     ?      ?      ?      ?
G            3.8        1.0     .8     ?      ?      ?      ?      --     ?      ?      ?
H            3.8        1.0     .8     ?      ?      ?      ?      ?      --     ?      ?
I            3.5         .9     .8     ?      ?      ?      ?      ?      ?      --     ?
J            3.1         .9     .7     ?      ?      ?      ?      ?      ?      ?      --

Total       73.0       15.8   14.5   11.6    6.4    4.9    4.7    4.1    4.0    3.8    3.2





Now the 73 persons whose theoretical audit distribution on 
first and second choices is given in Table 8 also listened to a 
third program. On the assumptions already made, the third- 
choice audits of these people ought to be distributed among 
Programs A to J in proportion to the popularity of these pro- 
grams as indicated in Table 2. This is done as follows. 

It is clear from Table 8 that 5.0 + 5.1 = 10.1 persons heard 
both Programs A and B as either first or second choice; i.e., 
the second entry in the first row of the table shows that 5.1 
persons theoretically heard A as first choice and B as second 
choice, and the first entry of the second row shows that 5.0 
persons theoretically heard Program B as first choice and A 
as second. 5.1 + 5.0 = 10.1 persons, therefore, listened to A 
and B as their first two choices, irrespective of order. Now 
each of these 10.1 theoretical persons also listened to a third 
program. On the hypothesis being tested, the third-choice 
audits of these 10.1 persons should be distributed among Pro- 
grams C, D, ... J in proportion to the relative popularities 
of these programs as indicated in Table 2. Of these 10.1 
theoretical persons, therefore, 

$$\frac{(10.1)(.1557)}{1 - (.2408 + .2186)} = 2.8$$

persons⁴ should theoretically have heard Program C as third 
choice. Similarly, 

$$\frac{(10.1)(.0733)}{1 - (.2408 + .2186)} = 1.4$$

persons should theoretically have heard Program D as third 
choice. The theoretical audits for the remaining programs E to J 
are computed in a similar fashion. 
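The third-choice step for a single pair of programs can be sketched as follows. The figures are taken from Table 2 and Table 8; the variable names are illustrative, and because the arithmetic is carried without rounding, the results come out near, though not always identical to, the rounded figures quoted in the text.

```python
# Relative popularity of each program (Table 2).
proportion = {"A": .2408, "B": .2186, "C": .1557, "D": .0733, "E": .0602,
              "F": .0563, "G": .0524, "H": .0524, "I": .0484, "J": .0419}

pair = ("A", "B")   # programs heard as first and second choice, in either order
persons = 10.1      # 5.1 (A then B) + 5.0 (B then A), from Table 8

# Divisor that puts the eight remaining programs on a 100 per cent basis.
divisor = 1.0 - sum(proportion[p] for p in pair)

third = {p: persons * q / divisor
         for p, q in proportion.items() if p not in pair}

for p, n in third.items():
    print(p, round(n, 2))
```

The third-choice audits for the pair add up to the 10.1 persons concerned, since each of them hears exactly one further program.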

It may be seen from Table 8 that 3.3+3.6=6.9 persons 
heard Programs A and C as their first two choices irrespective 
of order. This is evident because the third entry in Table 8, 
first row, indicates that 3.6 persons heard A as first choice and 
C as second, and the first entry in the third row indicates that 
3.3 persons heard C as first choice and A as second. Now these 
6.9 persons also heard a third program, and their audits to the 
various possible third programs ought on our hypothesis to be 
in proportion to the popularities of these programs as in Table 
2. Thus 

$$\frac{(6.9)(.2186)}{1 - (.2408 + .1557)} = 2.5$$

persons theoretically heard Program B as third choice. Similarly, 

$$\frac{(6.9)(.0733)}{1 - (.2408 + .1557)} = .8$$

persons theoretically heard Program D as third choice. The 
third-choice audits for the remaining programs from E to J are 
computed in similar fashion. 

In order to determine the theoretical distribution of third- 
choice audits among all the programs, it is necessary to com- 
pute these audits for each possible group of two programs 
which might have been heard as first and second choices by the 
73 listeners. This has been done, and the results are given in 
Table 9. The reader should have no difficulty in filling in the 
entries by analogy from the examples already given. 

4 The divisor (1 - (.2408 + .2186)) is introduced to change the distribu- 
tion of audits to Programs C to J in Table 2 to a 100 per cent basis. See 
footnote 3. 

TABLE 9 (Continued) 

Programs heard
as first and            Third-choice audits for remaining programs
second choice          A      B      C      D      E      F      G      H      I      J

[Rows for the pairs F-G, F-H, F-I, F-J, G-H, G-I, G-J, H-I, H-J, and I-J; 
the individual entries are illegible in the source.] 

Total (73.0)         12.7   12.6   11.3    6.2    6.0    5.5    4.9    4.9    4.4    4.0





Table 9 gives the theoretical distribution of third-choice 
audits for all persons who theoretically heard every combina- 
tion of two programs on their first two choices. The totals at 
the bottom of Table 9, therefore, indicate the theoretical dis- 
tribution of third-choice audits to Programs A to J for all per- 
sons hearing three programs, just as the totals at the bottom of 
Table 6 indicate the theoretical distribution of second-choice 
audits for persons who heard two programs only. 

From Tables 8 and 9, it is now possible to determine the 
theoretical distribution of all audits to Programs A to J by 
persons hearing three programs. Table 8 indicates that 17.6 
persons theoretically heard Program A as first choice, and that 
15.8 heard it as second choice, while Table 9 indicates that 
12.7 persons theoretically heard it as third choice. Conse- 
quently, the theoretical number of audits given Program A 
must be 17.6+15.8+12.7=46.1. The theoretical number of 
audits for the remaining programs, for persons who heard 3 
programs, are computed in similar fashion. These figures, 
together with the theoretical audits for persons hearing four 
and five programs, are given in Table 10. 

TABLE 10 

Theoretical Distribution of Audits to Ten Chosen Evening Programs By 
the Number of These Ten Programs Heard Completely 

No. of      No. of    No. of                            Programs
programs    listen-   audits
heard       ers                   A       B       C      D      E      F      G      H      I      J
   1          65        65      15.7    14.2    10.1    4.8    3.9    3.7    3.4    3.4    3.1    2.7
   2          63       126      28.4    26.3    19.6    9.9    8.1    7.6    7.0    7.0    6.6    5.5
   3          73       219      46.1    43.1    34.3   17.9   15.3   14.3   12.8   12.7   11.7   10.3
   4          51       204      38.2    37.1    30.8   17.7   15.3   14.5   13.5   13.6   12.7   11.1
   5          30       150      25.5    24.8    21.7   14.2   12.1   11.6   10.7   10.7    9.8    8.9

 Total       282       764     153.9   145.5   116.5   64.5   54.7   51.7   47.4   47.4   43.9   38.5





d. Persons who heard four and five programs. The method 
of computing the theoretical number of audits for persons 
hearing four and five programs is entirely similar to that for 
computing the theoretical distribution for those who heard 
three programs. It is necessary, however, to carry the process 
of subdivision further. The analysis for those hearing four 
programs would begin with a table similar to Table 9 but 
based on 51 persons. The number of theoretical audits for 
each combination of three programs would then be computed 
and the remaining audits distributed among the remaining 
seven programs. The sums of columns in the new table would 
then give the distribution of fourth-choice audits. 

These computations have been made and are presented for 
the data of Tables 1 and 2 in Table 10. 
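The whole procedure, for any number of programs heard, can be sketched by enumerating every possible order of preference. This is an illustrative restatement under the assumptions already described, not the authors' own worksheet; the function name `expected_audits` is hypothetical, and since no intermediate rounding is done the figures should be close to, but need not coincide with, those of Table 10.

```python
from itertools import permutations

# Relative popularity of each program (Table 2).
proportion = {"A": .2408, "B": .2186, "C": .1557, "D": .0733, "E": .0602,
              "F": .0563, "G": .0524, "H": .0524, "I": .0484, "J": .0419}


def expected_audits(listeners, k, proportion):
    """Theoretical audits per program for persons who each hear k programs,
    each successive choice being made in proportion to overall popularity
    among the programs not yet chosen."""
    expected = {p: 0.0 for p in proportion}
    for order in permutations(proportion, k):   # every order of preference
        prob, remaining = 1.0, 1.0
        for p in order:
            prob *= proportion[p] / remaining   # rescale to a 100 per cent basis
            remaining -= proportion[p]
        for p in order:
            expected[p] += listeners * prob     # each program in the order is audited
    return expected


# Listeners in each amount-of-listening group (Table 1).
groups = {1: 65, 2: 63, 3: 73, 4: 51, 5: 30}
table = {k: expected_audits(n, k, proportion) for k, n in groups.items()}

for k, row in table.items():
    print(k, groups[k], [round(v, 1) for v in row.values()])
```

Enumeration is affordable here because at most five of ten programs are heard, which gives 30,240 orders in the largest case; for longer lists a less exhaustive scheme would be needed.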


TEST OF SIGNIFICANCE 


Tables 1 and 10 are the basis for deciding whether amount 
of listening is related to program selection. Table 1 gives the 
actual distribution of audits to Programs A to J for groups of 
persons with different amounts of listening. Table 10 gives 
the distribution of audits which would be expected if amount 
of listening were unrelated to program selection. The ques- 
tion now is, are the actual and theoretical distributions similar 
enough so that the differences between them could have arisen 
by chance in sampling from a population in which amount of 
listening is unrelated to program selection, or is the divergence 
between actual and theoretical great enough to be statistically 
significant? 

This question can be answered for any individual program, 
or for the entire group of programs, by employing the Chi- 
square test. For example, consider the actual and theoretical 
distributions of audits to Program A. To make the test, compute 
the quantity 

$$X^2 = \sum \frac{(o - t)^2}{t}$$

where o is the observed 
number of audits, and t is the theoretical number, and the 
summation extends over the five classes. For Program A this 
quantity will be (12 - 15.7)²/15.7 + (37 - 28.4)²/28.4 + (58 - 46.1)²/46.1 
+ (47 - 38.2)²/38.2 + (30 - 25.5)²/25.5 = 9.36. There 
are five degrees of freedom available for making the test, since 
the sums of the theoretical and actual frequencies are not made 
to agree. For five degrees of freedom the probability that X² 
will exceed 9.36 by chance is found by application of Fisher’s 
table to be about .09, and the divergence between actual and 
theoretical frequencies is thus not quite significant on the .05 
probability level. A divergence as great as this would happen 
by chance 9 times out of 100 even when there is no true differ- 
ence. 
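
The arithmetic for Program A can be checked with a short sketch. The function names below are hypothetical, and the tail probability is obtained from a standard closed-form series for whole-number degrees of freedom rather than from Fisher's table, so this is an illustration under those assumptions and not the procedure used in the paper. The unrounded sum comes to roughly 9.37, which agrees with the 9.36 above up to rounding of the individual terms, and the tail probability falls between .09 and .10.

```python
import math


def chi_square(observed, theoretical):
    """Sum of (o - t)^2 / t over all classes."""
    return sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))


def chi_square_tail(x, df):
    """Probability that chi-square with integer df exceeds x.

    Uses the closed-form series for integer degrees of freedom
    (an assumption of this sketch; the paper consulted a printed table).
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a correction series.
    chi = math.sqrt(x)
    normal_tail = math.erfc(chi / math.sqrt(2))  # equals 2 * Q(chi)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)  # chi^(2r-1) / (1*3*...*(2r-1))
        total += term
    return normal_tail + 2 * density * total


# Observed and theoretical audits to Program A, as quoted in the text.
observed_a = [12, 37, 58, 47, 30]
theoretical_a = [15.7, 28.4, 46.1, 38.2, 25.5]

statistic = chi_square(observed_a, theoretical_a)
probability = chi_square_tail(statistic, df=5)
print(f"chi-square = {statistic:.2f}, tail probability = {probability:.3f}")
```

The same two functions apply to any other program by substituting its row of observed and theoretical audits.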


INTERPRETATION 


The ten programs considered in this paper are as follows: 

A—Jack Benny (General Foods—Jello) 

B—Joe Penner (Bakers Broadcast — Fleischmann’s 
Yeast) 

C—Eddie Cantor with ‘‘Parkyakarkus’’ (Pebeco) 

D—Walter Winchell (The Jergen’s Program) 

E—Major Bowes Amateur Hour (Standard Brands) 

F—Adventures of Sherlock Holmes (G. Washington 
Coffee) 

G—Ford Symphony Orchestra with Schipa 

H—Wayne King’s Orchestra (sustaining) 
I—Tomorrow’s Tribune and Sports Shots (Hamlin’s 
Wizard Oil) 

J—Wayne King’s Orchestra (Lady Esther) 

The actual and theoretical distributions for each of these 
programs have been tested. For two of these programs, there 
is distinct evidence that amount of listening is significantly 
related to choice of the program. The value of χ² for Joe 
Penner’s broadcast is 15.38, a highly significant value, arising 
from the fact that many fewer persons hearing only one 
program listen to Joe Penner than would be expected if amount 
of listening were unrelated to program selection. Tomorrow’s 
Tribune and Sports Shots, for which χ² = 58.49, is heard much 
more frequently by persons hearing only one program. The 
specialized nature of the latter program probably accounts 
for its great popularity with persons who hear but a single 
program. 

The value of chi-square computed for all of the programs 
together, moreover, indicates that for the entire group of pro- 
grams amount of listening is significantly related to program 
selection. For all the programs together the value of χ² is 
122.53, whereas the value which would be exceeded by chance 
only one time in 100 is about 78.⁵ For the table as a whole, 
therefore, the divergence of actual from theoretical frequen- 
cies is highly significant. The divergence is an expression of 
the fact that the most popular programs are less likely to be 
chosen by persons who hear only one or two programs, while 
the less popular programs are more likely to be chosen by 
those persons than by people who hear three or more programs. 


⁵ There are 50 − 1 = 49 degrees of freedom available for the comparison, 
because the grand total of actual and theoretical frequencies must be 
equal. 


NOTES AND NEWS 


The Conference for the Education of the Gifted will be held at Teachers 
College, Columbia University, Friday and Saturday, December 13 and 14, 
1940. This Conference is held in memory of Leta S. Hollingworth, 
Professor of Education, Teachers College. The morning session on Friday 
will be held in Horace Mann Auditorium with the general theme of ‘‘The 
Education of Leaders in a Democracy.’’ The afternoon session will 
consist of a series of seminars. Dr. Arthur I. Gates, Head of the 
Department of Educational Psychology, Teachers College, will act as chairman 
of the evening session. Dr. Rudolf Pintner will speak on ‘‘The Problem 
of the Gifted in Its Relation to the Larger Problem of Differential 
Psychology.’’ The Saturday morning session will consist of a demonstration 
of a class of rapid learners from Speyer School followed by discussions by 
educators, parents and laymen. 


Matching Youth and Jobs is the title of a study prepared by Howard M. 
Bell for the American Youth Commission and published by the American 
Council on Education. That the occupational and educational needs of 
American youth are great is graphically brought out. There are nearly 
4,000,000 young people between the ages of fifteen and twenty-four out of 
work and out of school today, with 1,750,000 more finishing or leaving 
school every year to start job-hunting. It was found that fewer than one 
in four have had any practical help in finding out what work fits them 
best. The author avoids a controversial tone, yet he believes that the 
community itself is chiefly to blame for such a condition and that the 
school may well be the logical center for vocational guidance, education 
and even placement. He deplores the notion that there is something more 
dignified about a white-collar job than a shop job, and that there is greater 
culture in training to work with the brain than in training to work with 
the hands. He asks for a well balanced curriculum which would include 
vocational education as an integral part of general education. The author 
also stresses the need of full-time counselors both in the schools and 
in employment offices. 


The Twelfth Annual Meeting of the Eastern Association of College 
Deans and Advisers of Men will be held in Haddon Hall, Atlantic City, 
New Jersey, on Saturday, November 23, 1940. Among the speakers at the 
morning session will be Dr. James A. McClintock, Director of Personnel, 
Drew University. His topic will be ‘‘Trends in Student Guidance.’’ Dr. 
Robert G. Bernreuter, Director of the Psycho-Educational Clinic, Pennsylvania 
State College, will speak on ‘‘Student Guidance on Various Levels, 
Techniques and Reasonable Expectations.’’ The afternoon session will consist 
of a round table discussion of student guidance organization and problems 
under the chairmanship of Dean William L. Machmer, President of the 
Association. 


The Child Research Clinic of The Woods Schools of Langhorne, Pa., held 
its Seventh Institute on the Exceptional Child on October 22, 1940. The 
topic under discussion, ‘‘The Life of an Exceptional Child,’’ covered the 
following phases: ‘‘Life Begins,’’ ‘‘Infant Training,’’ ‘‘Primary School 
Training,’’ ‘‘Growing Up,’’ ‘‘Adolescence’’ and ‘‘Life Span.’’ 


Dr. Herbert Woodrow, Chairman of the Department of Psychology, Uni- 
versity of Illinois, has been made the President of the American Psycho- 
logical Association. In a recent issue of the Psychological Monographs 
(Vol. 52, No. 3) appear several Studies in Quantitative Psychology as 
follows: ‘‘The Problem of the Interrelationship of Determining Condi- 
tions’’ by Dr. Woodrow; ‘‘The Effect of a Fixed Change in Difficulty at 
Various Levels of Difficulty’’ by John M. Willmann; ‘‘The Measurement 
of Memory on an Absolute Scale’’ by Harriett C. Shurrager; ‘‘A Factor 
Analysis of Forty Character Tests’’ by Hubert E. Brogden; and ‘‘The 
Effects of Practice upon Standard Errors of Estimate’’ by Leland P. 
Bradford. 


The Stanford University Press has recently announced the publication 
of a new test by Frederick L. Pond entitled ‘‘Inventory of Reading Ex- 
periences.’’ It was designed for high school and college students to 
appraise the quality and quantity of reading experiences. Part I, the 
Qualitative Inventory, represents an analysis of the types of reading mate- 
rials and types of motivation discussed extensively in educational litera- 
ture. In weighting the responses in Part II, the Quantitative Inventory, 
a criterion was secured from the four-weeks’ diary recordings of twelfth- 
grade students under ten types of reading activity: hours spent each day 
in reading or study, visits made to a library, number of magazines read 
each day, number of books completed, number of entire evenings spent in 
reading or study, days in which there was conversation about things to 
read, number of times a dictionary was consulted, books borrowed to read, 
etc. Four weeks after the diary records were closed, the Quantitative 
Inventory was administered to the same students. The testing of inner 
consistency by the method of split halves, stepped up by the 
Spearman-Brown formula, for a group of 279 eleventh-grade students, produced a 
reliability for Part I, the Qualitative Inventory, of .922 ± .006 and for 
Part II, the Quantitative Inventory, of .911 ± .007. 


BOOK REVIEWS 


LAZARSFELD, PAUL F. Radio and the Printed Page. New York: Duell, 
Sloan and Pearce, Inc., 1940. xviii+354 pp. $4.00. 

In his introductory remarks the writer suggests several considerations 
which make a better understanding of the social and educational impli- 
cations of radio imperative. The preservation of democracy in view of 
our ever increasing centralization of economic production; problems of 
relative literacy, i.e., literacy in terms of understanding the problems 
confronting us here and now; the increasing tendency of people to ‘‘write 
in’’ to radio stations and their congressmen may serve as examples of such 
problems. 

The first chapter is a study of the relation of the cultural level of the 
home to listening habits. In one investigation the data from 300,000 tele- 
phone calls were analyzed to bring out the relation between cultural level 
of the home (inferred from income level) and the amount of listening at 
different times of the day and on different days of the week. In another 
study conducted by the owners of a Buffalo newspaper and broadcasting 
station, the data were analyzed to show the relation between cultural level 
and serious listening. In this study face-to-face interviews were used, 
thus all social strata could be reached and the selective factor operating 
in telephone surveys eliminated. Data from other sources are similarly 
analyzed. The writer states: ‘‘The evidence adduced reveals that, as we 
go down the cultural scale, there is more and more radio listening but 
less and less serious listening.’’ Thus radio is largely failing to justify 
the hopes held for it as an educational agency. It is pointed out, however, 
that print did not raise the intellectual standard of living simply by virtue 
of coming into being. Print became educationally significant because it 
was used for educational and cultural purposes. So ‘‘Forces outside of 
radio will have to be brought into operation to provide vehicles and 
establish audiences for serious broadcasts.’’ 

The second chapter shows that radio does, to a great extent, provide 
informational programs, but of a sort that educators, at least, refuse to 
regard as educational. Here the main emphasis is placed on the quiz 
type of program. The appeal of the ‘‘Professor Quiz’’ program is 
discussed in considerable detail. 

In the third chapter serious listening is discussed with a view to in- 
creasing it. Also in this chapter another interesting type of research is 
brought in. Two rural counties, differing in wealth, background, cultural 
and social activities and the like, are compared for their interest in serious 
listening. 

The fourth chapter presents a survey of the conditions under which 
people choose to read or to listen; the fifth deals with radio and the printed 
page as sources of news; and the sixth points toward the influence that 
radio has, actually and potentially, on reading. 

The reader of the foregoing part of this review, but not the reader of 
the book, may be annoyed by the emphasis upon serious listening. As 
used by the writer neither the term nor the concept is in the least offensive. 
Nor can a mere review present more than a hint of the richness of inter- 
pretation in the book itself. As for the data, they are of varying degrees 
of value. Some were obtained directly by the Office of Radio Research, 
of which the author is director, but many were obtained from other sources, 
and re-analyzed for the purposes of the present study. Thus the book has 
the advantage of synthesizing much research, but the disadvantage that 
the research was not uniformly under the control of one person or agency. 
The author recognizes this limitation, states the sources of his data, and 
exercises caution and restraint in his interpretations. Such a large under- 
taking as the book represents must of necessity offer but an incomplete 
picture. The outline is ambitious, many of the details are sketchy, yet not 
so sketchy as one might expect in a work of this scope. It is more a 
matter of some phases of the study being incompletely worked out and 
some problems being left unattacked. In some instances the method em- 
ployed was not the ideal one, but the only one available in view of prac- 
tical considerations. Of far greater importance is the fact that the work 
reported is well conceived, the problems attacked are significant ones, and 
the data presented are such as to lead to definite and useful conclusions. 
A book bringing together a vast amount of research centered about the 
topic announced by the title, cannot fail to be significant for a wide 
variety of interests. For the advertiser, the educator, the radio specialist 
of almost every type, and for the citizen simply as citizen there is much 
of interest and value. 

There are 57 tables and 17 charts listed at the end of the book. These 
are appropriately located throughout the body of the text. 

AMOS C. ANDERSON, 
Ohio University 


CANTRIL, HADLEY (with the assistance of HAZEL GAUDET and HERTA 
HERZOG). The Invasion from Mars. Princeton: Princeton University 
Press, 1940. xv+228 pp. 

The Office of Radio Research (formerly at Princeton, now at Columbia) 
has been attacking a series of socio-psychological problems involved 
in analyzing the effects of radio broadcasting. Most of their work follows 
a pre-arranged plan but this volume testifies to an establishment 
sufficiently flexible to permit an opportunistic seizure of unexpectedly available 
raw material. Few social psychologists would hesitate to grasp an oppor- 
tunity for the first-hand study of panic; and few, alas, will ever encounter 
such an opportunity. This makes it all the more fortunate that the 
Princeton research group had available trained field investigators and 
tested techniques to put into the field within a week of the occurrence. 

In the fall of 1938, it will be recalled, the Mercury Theater of Orson 
Welles broadcast his version of H. G. Wells’s ‘‘The War of the Worlds.’’ 
It will also be recalled that the next day’s newspapers carried accounts of 
local panics complete with choked telephone switchboards, harassed police 
forces, and fleeing citizens. A special grant of funds made it possible 
for the Princeton group to get investigators into the field without undue 
delay and to make an intensive study of a sample of frightened persons 
in the nearby region. The analysis of the results of this survey provides 
the materials for this volume. The result will be of direct interest and 
importance to all those who hold that social psychology is evolving along 
lines which demand empirical evidence in place of speculation on the basis 
of uncontrolled observation. 

The use of a ‘‘pluralistic’’ approach made it possible to check the re- 
sults of 135 detailed interviews against various other sources of informa- 
tion. Estimates indicate that more than a million persons were seriously 
upset by the broadcast, and 100 of the interviews were obtained with 
those who confessedly belonged in this category. These are presented in 
some detail. 

The volume opens with a presentation of the full text of the broadcast 
which was heard by at least 6,000,000 people (although it was competing 
with the broadcast which featured Charlie McCarthy). The estimate of 
the total number who were seriously frightened is based upon such varied 
sources as the polls of the American Institute of Public Opinion, reports of 
high school administrators, telephone volume, mail volume, and analyses 
of newspaper clippings. Reasons for the frightening effects include the 
accepted status of the radio as a vehicle for announcements, the prestige 
of the speakers, the comprehensibility of specific incidents, the total at- 
mosphere of the broadcast, and late tuning which eliminated the prelimi- 
nary announcement. 

In general the study shows that those who were frightened were those 
who failed to make adequate checks on the veracity of the performance. 
Education and economic standards are shown to be conditions of the 
failure to check, both of them being positively related to the ‘‘critical 
ability’’ which is found to be the most important variable related to the 
panic reaction. Personal susceptibility and the unusual listening situation 
are shown to be negative conditions for the employment of this critical 
ability. The influence of unsettled social, economic, and political 
conditions is considered as a factor in the panic reaction. 



The authors conclude that they have found ‘‘no single observable 
variable consistently related to the (panic) reaction, although a lack of 
critical ability seemed particularly conducive to fear in a large propor- 
tion of the population.’’ Failure to doubt or to discredit the broadcast 
is traced to utter lack of standards of criticality, to inadequate standards 
of criticality or of the authenticity of information, or to agreement 
between the broadcast and latent expectations of the respondents. The 
violent nature of the reaction is explained by the authors to have been 
due to ‘‘the enormous felt ego-involvement the situation created and to 
the complete inability of the individual to alleviate or control the conse- 
quences of the invasion.’’ 

Granted that the tools employed were necessarily those of the pioneer, 
and granted that the sample was small, the achievement represented in the 
volume is undeniable. This close-range study of a panic patently consti- 
tutes a genuine contribution to contemporary social psychology. 

JOHN G. JENKINS, 
University of Maryland 